The research carried out in the subject application was supported in part by
grants from the National Institutes of Health. The government may have rights in 
any patent issuing on this application. 



Technical Field 

The technical field of this invention concerns peptides, polypeptides, and 
polynucleotides involved in nerve cell growth. 

Background 

The specificity of the wiring of the nervous system — the complex pattern 
of specific synaptic connections — begins to unfold during development as the 
growing tips of neurons - the growth cones - traverse long distances to find their 
correct targets. Along their journey, they are confronted by and correctly navigate 
a series of choice points in a remarkably unerring way to ultimately contact and 
recognize their correct target. 

The identification of growth cone guidance cues is, to a large extent, the
holy grail of neurobiology. These are the compounds that tell neurons when to 
grow, where to grow, and when to stop growing. The medical applications of 
such compounds and their antagonists are enormous and include modulating 
neuronal growth regenerative capacity, treating neurodegenerative disease, and 
mapping (e.g. diagnosing) genetic neurological defects. 

Over decades of concentrated research, various hypotheses of chemoattractants
and repellents, labeled pathways, cell adhesion molecules, etc. have been
invoked to explain guidance. Recently, several lines of experiments suggest
repulsion may play an important role in neuron guidance and two apparently 
unrelated factors ("Neurite Growth Inhibitor" and "Collapsin") capable of 
inhibiting or collapsing growth cones have been reported. 

Relevant Literature

For a recent review of much of the literature in this field, see Goodman and 
Shatz (1993) Cell 72/Neuron 10, 77-98. A description of grasshopper fasciclin IV 
(now called G-Semaphorin I) appears in Kolodkin et al. (1992) Neuron 9, 831-845. 
Recent reports on Collapsin and Neurite Growth Inhibitor include Raper and 
Kapfhammer (1990) Neuron 4, 21-29, an abstract presented by Raper at the 
GIBCO-BRL Symposium on "Genes and Development/Function of Brain" on July 
26, 1993 and Schwab and Caroni (1988) J Neurosci 8, 2381 and Schnell and 
Schwab (1990) Nature 343, 269, respectively. 

SUMMARY OF THE INVENTION
A novel class of proteins, semaphorins, nucleic acids encoding
semaphorins, and methods of using semaphorins and semaphorin-encoding nucleic
acids are disclosed. Semaphorins include the first known family of human proteins
which function as growth cone inhibitors and a family of proteins involved in viral,
particularly pox viral, pathogenesis and oncogenesis. Families of
semaphorin-specific receptors, including receptors found on nerve growth cones
and immune cells, are also disclosed.

The invention provides agents, including semaphorin peptides, which
specifically bind semaphorin receptors and agents, including semaphorin receptor
peptides, which specifically bind semaphorins. These agents provide potent
modulators of nerve cell growth, immune responsiveness and viral pathogenesis
and find use in the treatment and diagnosis of neurological disease and
neuro-regeneration, immune modulation including hypersensitivity and
graft-rejection, and diagnosis and treatment of viral and oncological
infection/diseases.

Semaphorins, semaphorin receptors, semaphorin-encoding nucleic acids,
and unique portions thereof also find use variously in screening chemical libraries
for regulators of semaphorin or semaphorin receptor-mediated cell activity, in
genetic mapping, as probes for related genes, as diagnostic reagents for genetic
neurological, immunological and oncological disease and in the production of
specific cellular and animal systems for the development of neurological,
immunological, oncological and viral disease therapy.


DESCRIPTION OF SPECIFIC EMBODIMENTS 
The present invention discloses novel families of proteins important in nerve
and immune cell function: the semaphorins and the semaphorin receptors. The
invention provides agents, including semaphorin peptides, which specifically bind
semaphorin receptors and agents, including semaphorin receptor peptides, which
specifically bind semaphorins. These agents find a wide variety of clinical,
therapeutic and research uses, especially agents which modulate nerve and/or
immune cell function by specifically mimicking or interfering with
semaphorin-receptor binding. For example, selected semaphorin peptides shown to
act as semaphorin receptor antagonists are effective by competitively inhibiting
native semaphorin association with cellular receptors. Thus, depending on the
targeted receptor, these agents can be used to block semaphorin-mediated neural
cell growth cone repulsion or contact inhibition. Such agents find broad clinical
application where nerve cell growth is indicated, e.g. traumatic injury to nerve
cells, neurodegenerative disease, etc. A wide variety of semaphorin- and
semaphorin receptor-specific binding agents and methods for identifying, making
and using the same are described below.

Binding agents of particular interest are semaphorin peptides which
specifically bind and antagonize a semaphorin receptor and semaphorin receptor
peptides which specifically bind a semaphorin and prevent binding to a native
receptor. While exemplified primarily with semaphorin peptides, much of the
following description applies analogously to semaphorin receptor peptides.

The semaphorin peptides of the invention comprise a unique portion of a
semaphorin and have semaphorin binding specificity. A "unique portion" of a
semaphorin has an amino acid sequence unique to that disclosed in that it is not
found in any previously known protein. Thus a unique portion has an amino acid
sequence length at least long enough to define a novel peptide. Unique
semaphorin portions are found to vary from about 5 to about 25 residues,
to 10 residues in length, depending on the particular amino acid
sequence. Unique semaphorin portions are readily identified by comparing the
subject semaphorin portion sequences with known peptide/protein sequence
databases. Preferred unique portions derive from the semaphorin domains (which
exclude the Ig-like, intracellular and transmembrane domains as well as the signal
sequences) of the disclosed semaphorin sequences, especially regions that bind the
semaphorin receptor, especially those of the human varieties. Preferred semaphorin
receptor unique portions derive from the semaphorin binding domains, especially
regions with residues which contact the semaphorin ligand, especially those of the
human varieties. Particularly preferred peptides are further described herein.
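
The database comparison described above can be sketched as a simple window search. This is a minimal illustration only: the function name `unique_portions`, the exact-match criterion and the toy sequences are assumptions made for the example, and a real comparison would be run against a full peptide/protein database.

```python
def unique_portions(candidate, known_sequences, length=5):
    """Return (start, fragment) for every window of `candidate` that does
    not occur verbatim in any of the known sequences."""
    known_windows = set()
    for seq in known_sequences:
        for i in range(len(seq) - length + 1):
            known_windows.add(seq[i:i + length])

    hits = []
    for i in range(len(candidate) - length + 1):
        fragment = candidate[i:i + length]
        if fragment not in known_windows:
            hits.append((i, fragment))
    return hits


# Toy, made-up sequences used purely for illustration.
known = ["MKTAYIAKQR", "GGSAYIAKLL"]
for start, fragment in unique_portions("TAYIAKQRDCQNYI", known, length=5):
    print(start, fragment)
```

The window length corresponds to the lower end of the residue range given above; longer windows can be tested by changing `length`.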

The subject peptides may be free or coupled to other atoms or molecules.
Frequently the peptides are present as a portion of a larger polypeptide comprising
the subject peptide where the remainder of the polypeptide need not be semaphorin-
or semaphorin receptor-derived. Alternatively, the subject peptide may be present
as a portion of a "substantially full-length" semaphorin domain or semaphorin
receptor sequence which comprises or encodes at least about 200, preferably at
least about 250, more preferably at least about 300 amino acids of a disclosed
semaphorin/receptor sequence. Thus the invention also provides polypeptides
comprising a sequence substantially similar to that of a substantially full-length
semaphorin domain or a semaphorin receptor. "Substantially similar" sequences
share at least about 40%, more preferably at least about 60%, and most preferably
at least about 80% sequence identity. Where the sequences diverge, the
differences are generally point insertions/deletions or conservative substitutions,
i.e. a cysteine/threonine or serine substitution, an acidic/acidic or
hydrophobic/hydrophobic amino acid substitution, etc.
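
The identity thresholds above can be illustrated with a short calculation on two sequences that have already been aligned. This is a sketch under stated assumptions: the disclosure does not say how identity is computed, so counting every aligned column except gap-versus-gap columns is a convention chosen here, and the names `percent_identity` and `similarity_tier` are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity of two pre-aligned sequences ('-' marks a gap).
    Assumption: columns where both sequences have a gap are ignored, and
    every other column counts toward the denominator."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" and b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def similarity_tier(identity):
    """Map a percent identity onto the 40% / 60% / 80% tiers in the text."""
    if identity >= 80:
        return "most preferred"
    if identity >= 60:
        return "more preferred"
    if identity >= 40:
        return "substantially similar"
    return "not substantially similar"


identity = percent_identity("DCQNYI", "DCKNFI")
print(round(identity, 1), similarity_tier(identity))
```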

The subject semaphorin peptides/polypeptides are "isolated", meaning
unaccompanied by at least some of the material with which they are associated in
their natural state. Generally, an isolated peptide/polypeptide constitutes at least
about 1%, preferably at least about 10%, and more preferably at least about 50%
by weight of the total peptide/protein in a given sample. By pure
peptide/polypeptide is intended at least about 90%, preferably at least 95%, and
more preferably at least about 99% by weight of total peptide/protein. Included in
the subject peptide/polypeptide weight are any atoms, molecules, groups, or
peptide/polypeptide, especially peptides, proteins, detectable labels, glycosylates,
phosphorylations, etc.

The subject peptides/polypeptides may be isolated or purified in a variety of 
ways known to those skilled in the art depending on what other components are 
present in the sample and to what, if anything, the peptide/polypeptide is 
covalently linked. Purification methods include electrophoretic, molecular, 
immunological and chromatographic techniques, especially affinity chromatography 
and RP-HPLC in the case of peptides. For general guidance in suitable purification
techniques, see Scopes, R., Protein Purification, Springer-Verlag, NY (1982).

The subject peptides/polypeptides generally comprise naturally occurring 
amino acids but D-amino acids or amino acid mimetics coupled by peptide bonds 
or peptide bond mimetics may also be used. Amino acid mimetics are other than 
naturally occurring amino acids that conformationally mimic the amino acid for the 
purpose of the requisite semaphorin/receptor binding specificity. Suitable mimetics 
are known to those of ordinary skill in the art and include β-γ-δ amino and imino
acids, cyclohexylalanine, adamantylacetic acid, etc., modifications of the amide
nitrogen, the α-carbon, amide carbonyl, backbone modifications, etc. See,
generally, Morgan and Gainor (1989) Ann. Repts. Med. Chem 24, 243-252;
Spatola (1983) Chemistry and Biochemistry of Amino Acids, Peptides and
Proteins, Vol VII (Weinstein) and Cho et al. (1993) Science 261, 1303-1305 for
the synthesis and screening of oligocarbamates. 

The subject semaphorin peptides/polypeptides have a "semaphorin binding 
specificity" meaning that the subject peptide/polypeptide retains a molecular 
conformation specific to one or more of the disclosed semaphorins and specifically 
recognizable by a semaphorin-specific receptor, antibody, etc. As such, a 
semaphorin binding specificity may be provided by a semaphorin-specific 
immunological epitope, lectin binding site, etc., and preferably, a receptor binding 
site. Analogously, the semaphorin receptor peptides/polypeptides have a 
"semaphorin receptor binding specificity" meaning that these peptides/polypeptides 
retain a molecular conformation specific to one or more of the disclosed 
semaphorin receptors and specifically recognizable by a semaphorin, a receptor- 
specific antibody, etc. 

"Specific binding" is empirically determined by contacting, for example, a
semaphorin-derived peptide with a mixture of components and identifying those
components that preferentially bind the semaphorin. Specific binding is most
conveniently shown by competition with labeled ligand using recombinant
semaphorin peptide either in vitro or in cellular expression systems as disclosed
herein. Generally, specific binding of the subject semaphorin has binding affinity
of 10⁻⁶ M, preferably 10⁻⁸ M, more preferably 10⁻¹⁰ M, under in vitro conditions as
exemplified below.
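
A competition experiment of the kind described above yields an IC50, which is commonly converted to an inhibition constant with the Cheng-Prusoff relation. That relation is not named in this disclosure; it is used here as a standard stand-in. The function names are hypothetical, and the tier boundaries of 1e-6, 1e-8 and 1e-10 M are an assumed reading of the affinity figures in the preceding paragraph.

```python
def cheng_prusoff_ki(ic50, labeled_ligand_conc, labeled_ligand_kd):
    """Inhibition constant from a competition assay:
    Ki = IC50 / (1 + [L] / Kd), all values in mol/L."""
    if ic50 <= 0 or labeled_ligand_conc < 0 or labeled_ligand_kd <= 0:
        raise ValueError("concentrations must be positive")
    return ic50 / (1.0 + labeled_ligand_conc / labeled_ligand_kd)


def affinity_tier(kd_molar):
    """Classify a dissociation constant against the assumed thresholds
    (a smaller value means tighter binding)."""
    if kd_molar <= 1e-10:
        return "more preferred"
    if kd_molar <= 1e-8:
        return "preferred"
    if kd_molar <= 1e-6:
        return "specific"
    return "below threshold"


# Hypothetical assay numbers chosen only to exercise the functions.
ki = cheng_prusoff_ki(ic50=2e-8, labeled_ligand_conc=1e-9, labeled_ligand_kd=1e-9)
print(ki, affinity_tier(ki))
```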

The peptides/polypeptides may be modified or joined to other compounds
using physical, chemical, and molecular techniques disclosed or cited herein or
otherwise known to those skilled in the relevant art to affect their semaphorin
binding specificity or other properties such as solubility, membrane
transportability, stability, binding specificity and affinity, chemical reactivity,
toxicity, bioavailability, localization, detectability, in vivo half-life, etc. as assayed
by methods disclosed herein or otherwise known to those of ordinary skill in the
art. For example, point mutations are introduced by site directed mutagenesis of
nucleotides in the DNA encoding the disclosed semaphorin polypeptides or in the
course of in vitro peptide synthesis.

Other modifications to further modulate binding specificity/affinity include
chemical/enzymatic intervention (e.g. fatty acid-acylation, proteolysis,
glycosylation) and especially where the peptide/polypeptide is integrated into a
larger polypeptide, selection of a particular expression host, etc. In particular,
many of the disclosed semaphorin peptides contain serine and threonine residues
which are phosphorylated or dephosphorylated. See e.g. methods disclosed in
Roberts et al. (1991) Science 253, 1022-1026 and in Wegner et al. (1992) Science
256, 370-373. Amino and/or carboxyl termini may be functionalized e.g., for the
amino group, acylation or alkylation, and for the carboxyl group, esterification or
amidification, or the like. Many of the disclosed semaphorin peptides/polypeptides
also contain glycosylation sites and patterns which may be disrupted or modified,
e.g. by enzymes like glycosidases, or used to purify/identify the receptor, e.g. with
lectins. For instance, N or O-linked glycosylation sites of the disclosed
semaphorin peptides may be deleted or substituted for by another basic amino acid
such as Lys or His for N-linked glycosylation alterations, or deletions or polar





WO 95/07706 PCT/US94/10151

substitutions are introduced at Ser and Thr residues for modulating O-linked
glycosylation. Glycosylation variants are also produced by selecting appropriate
host cells, e.g. yeast, insect, or various mammalian cells, or by in vitro methods
such as neuraminidase digestion. Useful expression systems include COS-7, 293,
BHK, CHO, TM4, CV1, VERO-76, HELA, MDCK, BRL 3A, WI38, Hep G2,
MMT 060562, TRI cells and baculovirus systems, for example. Other covalent
modifications of the disclosed semaphorin peptides/polypeptides may be introduced
by reacting the targeted amino acid residues with an organic derivatizing (e.g.
methyl-3-[(p-azido-phenyl)dithio] propioimidate) or crosslinking agent (e.g.
1,1-bis(diazoacetyl)-2-phenylethane) capable of reacting with selected side chains or
termini. For therapeutic and diagnostic localization, semaphorins and peptides
thereof may be labeled directly (radioisotopes, fluorescers, etc.) or indirectly with
an agent capable of providing a detectable signal, for example, a heart muscle
kinase labeling site.

The following are 14 classes of preferred semaphorin peptides where
bracketed positions may be occupied by any one of the residues contained in the
brackets and "X" signifies that the position may be occupied by any one of the 20
naturally encoded amino acids. These enumerated peptides maintain highly
conserved structures which provide important semaphorin binding specificities:

(a) [DE]C[QKRAN]N[YFV]I (SEQ ID NO:01)
    C[QKRAN]N[YFV]I[RKQT] (SEQ ID NO:02)
(b) CGT[NG][ASN][YFHG][KRHNQ] (SEQ ID NO:03)
    CGT[NG][ASN]XXP (SEQ ID NO:04)
    CGT[NG]XXXPX[CD] (SEQ ID NO:05)
    CGTXXXXPX[CD]XX[YI] (SEQ ID NO:06)
(c) [RIQV][GA][LVK][CS]P[FY][DN] (SEQ ID NO:07)
    [CS]P[FY][DN]P[DERK][HLD] (SEQ ID NO:08)
    GX[GA]X[CS]PY[DN]P (SEQ ID NO:09)
(d) L[FY]S[GA]T[VNA]A (SEQ ID NO:10)
    L[FY]SXTXA[DE][FY] (SEQ ID NO:11)
    [FY]S[GA]T[VNA]A[DE][FY] (SEQ ID NO:12)
(e) L[ND][AK]PNFV (SEQ ID NO:13)
(f) FFFRE (SEQ ID NO:14)
    FF[FY]RE[TN] (SEQ ID NO:15)
    FFRE[TN]A (SEQ ID NO:16)
    F[FY]RE[TN]A (SEQ ID NO:17)
    YFF[FY]RE (SEQ ID NO:18)
    [FY]FF[FY]RE (SEQ ID NO:19)
    [FY][FY][FY]RE[TN]A (SEQ ID NO:20)
    [IV][FY]F[FY][FY]RE (SEQ ID NO:21)
    D[KFY]V[FY][FYIL][FYIL][FY] (SEQ ID NO:22)
    [VI][FY][FYIL][FYIL]F[RT]X[TN] (SEQ ID NO:23)
    [VI][FY][FYIL][FYIL][FY][RT][EDV][TN] (SEQ ID NO:24)
(g) E[FY]IN[CS]GK (SEQ ID NO:25)
    [FY]INCGK[AVI] (SEQ ID NO:26)
(h) R[VI][AG][RQ][VI]CK (SEQ ID NO:27)
    R[VI]X[RQ][VI]CXXD (SEQ ID NO:28)
    GK[VAI]XXXR[VAI]XXXCK (SEQ ID NO:29)
(i) [RKN]W[TAS][TAS][FYL]L[KR] (SEQ ID NO:30)
    [FY]L[KR][AS]RL[NI]C (SEQ ID NO:31)
    [NI]CS[IV][PS]G (SEQ ID NO:32)
    W[TAS][TAS][FYL]LK[ASVIL]XL (SEQ ID NO:33)
    W[TAS][TAS]XLKXXLXC (SEQ ID NO:34)
    WX[TS]XLKXXLXC (SEQ ID NO:35)
(j) [FY][FY][ND]EIQS (SEQ ID NO:36)
    [FY]P[FY][FY][FY][ND]E (SEQ ID NO:37)
(k) GSA[VIL]CX[FY] (SEQ ID NO:38)
    SA[VIL]CX[FY]XM (SEQ ID NO:39)
(l) NS[NA]WL[PA]V (SEQ ID NO:40)
(m) [VLI]P[EDYSF]PRPG (SEQ ID NO:41)
    [VLI]PXP[RA]PGXC (SEQ ID NO:42)
    P[EDYSF]PRPG[TQS]C (SEQ ID NO:43)
(n) DP[HFY]C[AG]W (SEQ ID NO:44)
    P[HFY]C[AG]WD (SEQ ID NO:45)
    DPXC[AG]WD (SEQ ID NO:46)
    CXXXXDPXCXWD (SEQ ID NO:47)
    CXXXDPXCXWD (SEQ ID NO:48)
    CXXDPXCXWD (SEQ ID NO:49)
    CXXCXXXXDXXCXWD (SEQ ID NO:50)
    CXXCXXXDXXCXWD (SEQ ID NO:51)
    CXXCXXDXXCXWD (SEQ ID NO:52)
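
The bracket notation used in the classes above maps directly onto regular-expression character classes, so a sequence can be scanned for any listed pattern. The following is a minimal sketch: the helper names `motif_to_regex` and `find_motif` and the scanned sequences are assumptions for illustration, and "X" is expanded to the 20 naturally encoded amino acids as the notation defines it.

```python
import re

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 naturally encoded residues


def motif_to_regex(motif):
    """Translate the bracket notation into a regular-expression string.
    Bracketed sets are already valid character classes; only the spacing
    and the wildcard X need rewriting."""
    compact = motif.replace(" ", "")
    return compact.replace("X", "[" + AMINO_ACIDS + "]")


def find_motif(motif, sequence):
    """Return (start, matched_text) for every occurrence, including
    overlapping ones, of the motif in the sequence."""
    scanner = re.compile("(?=(" + motif_to_regex(motif) + "))")
    return [(m.start(), m.group(1)) for m in scanner.finditer(sequence)]


# Made-up sequences containing one instance of a class (a) and class (n) motif.
print(find_motif("[DE]C[QKRAN]N[YFV]I", "AAADCQNYIGG"))
print(find_motif("DPXC[AG]WD", "MKDPYCAWDLL"))
```

Wrapping the pattern in a lookahead is a design choice that lets overlapping hits be reported, which matters for repetitive patterns such as the class (f) motifs.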



The following peptides represent particularly preferred members of each
class:

(a) DCQNYI (subset of SEQ ID NO:01)
(b) CGT[NG][AS]XXP (subset of SEQ ID NO:04)
(c) GX[SC]PYDP (subset of SEQ ID NO:09)
(d) LYSGT[VNA]A (subset of SEQ ID NO:10)
(e) LNAPNFV (subset of SEQ ID NO:13)
(f) [FY]FF[FY]RE (SEQ ID NO:19)
(g) E[FY]IN[CS]GK (SEQ ID NO:25)
(h) R[VI]ARVCK (SEQ ID NO:27)
(i) W[TA][TS][FY]LK[AS]RL (subset of SEQ ID NO:33)
(j) PFYF[ND]EIQS (subset of SEQ ID NO:36)
(k) GSAVCX[FY] (subset of SEQ ID NO:38)
(l) NSNWL[PA]V (subset of SEQ ID NO:40)
(m) P[ED]PRPG[TQS]C (subset of SEQ ID NO:43)
(n) DPYC[AG]WD (subset of SEQ ID NO:46)
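
The "subset of" relationship noted for the particularly preferred peptides can be checked position by position: a narrower pattern is a subset of a broader one when each of its positions allows only residues that the broader pattern also allows. The sketch below handles patterns of equal length only, which is a simplifying assumption; `parse_motif` and `is_subset` are hypothetical helper names.

```python
AMINO_ACIDS = frozenset("ACDEFGHIKLMNPQRSTVWY")


def parse_motif(motif):
    """Split the bracket notation into one set of allowed residues per
    position. 'X' stands for any of the 20 naturally encoded residues."""
    compact = motif.replace(" ", "")
    positions = []
    i = 0
    while i < len(compact):
        if compact[i] == "[":
            close = compact.index("]", i)
            positions.append(frozenset(compact[i + 1:close]))
            i = close + 1
        else:
            allowed = AMINO_ACIDS if compact[i] == "X" else frozenset(compact[i])
            positions.append(allowed)
            i += 1
    return positions


def is_subset(narrow, broad):
    """True when both patterns have the same length and every position of
    `narrow` permits only residues permitted by `broad`."""
    a, b = parse_motif(narrow), parse_motif(broad)
    return len(a) == len(b) and all(x <= y for x, y in zip(a, b))


print(is_subset("DCQNYI", "[DE]C[QKRAN]N[YFV]I"))
print(is_subset("DPYC[AG]WD", "DPXC[AG]WD"))
```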

The following 14 classes are preferred peptides which exclude semaphorin
peptides encoded in open reading frames of Variola major or Vaccinia viruses.

(a) [DE]C[QKRAN]N[YFV]I (SEQ ID NO:01)
    C[QKRAN]N[YFV]I[RKQT] (SEQ ID NO:02)
(b) CGT[NG][AS][YFHG][KRHNQ] (SEQ ID NO:03)
    CGT[NG][ASN][YFH][KRHNQ] (SEQ ID NO:03)
    CGT[NG][AS]XXP (SEQ ID NO:04)
(c) [RIQV][GA][LVK][CS]P[FY][DN] (SEQ ID NO:07)
    [CS]P[FY][DN]P[DERK][HLD] (SEQ ID NO:08)
    GX[GA]X[CS]PY[DN]P (SEQ ID NO:09)
(d) L[FY]S[GA]T[VNA]A (SEQ ID NO:10)
    L[FY]SXTXA[DE][FY] (SEQ ID NO:11)
    [FY]S[GA]T[VNA]A[DE][FY] (SEQ ID NO:12)
(e) L[ND][AK]PNFV (SEQ ID NO:13)
(f) FFFRE (SEQ ID NO:14)
    FF[FY]RE[TN] (SEQ ID NO:15)
    FFRE[TN]A (SEQ ID NO:16)
    F[FY]RE[TN]A (SEQ ID NO:17)
    YFF[FY]RE (SEQ ID NO:18)
    [FY]FF[FY]RE (SEQ ID NO:19)
    [FY][FY][FY]RE[TN]A (SEQ ID NO:20)
    [IV][FY]F[FY][FY]RE (SEQ ID NO:21)
    D[KFY]V[FY][FYL][FYIL][FY] (SEQ ID NO:22)
    D[KFY]V[FY][FYIL][FYI][FY] (SEQ ID NO:22)
    [VI][FY][FYL][FYIL]F[RT]X[TN] (SEQ ID NO:23)
    [VI][FY][FYIL][FYI]F[RT]X[TN] (SEQ ID NO:23)
    [VI][FY][FYIL][FYIL]FRX[TN] (SEQ ID NO:23)
    [VI][FY][FYL][FYIL][FY][RT][EDV][TN] (SEQ ID NO:24)
(g) E[FY]IN[CS]GK (SEQ ID NO:25)
    [FY]INCGPpWT] (SEQ ID NO:26)
(h) R[VI][AG][RQ][VI]CK (SEQ ID NO:27)
    R[VI]X[RQ][VI]CXXD (SEQ ID NO:28)
    GK[VAI]XXXR[VAI]XXXCK (SEQ ID NO:29)
(i) [RKN]W[TA][TAS][FYL]L[KR] (SEQ ID NO:30)
    [FY]L[KR][AS]RL[NI]C (SEQ ID NO:31)
    [NI]CS[IV][PS]G (SEQ ID NO:32)
    W[TA][TAS][FYL]LK[ASVIL]XL (SEQ ID NO:33)
    W[TAS][TAS][FYL]LK[ASIL]XL (SEQ ID NO:34)
    W[TA][TAS]XLKXXLXC (SEQ ID NO:35)
(j) [FY][FY][ND]EIQS (SEQ ID NO:36)
    [FY]P[FY][FY][FY][ND]E (SEQ ID NO:37)
(k) GSA[VIL]CX[FY] (SEQ ID NO:38)
    SA[VI]CX[FY]XM (SEQ ID NO:39)
(l) NS[NA]WL[PA]V (SEQ ID NO:40)
(m) [VLI]P[EDYSF]PRPG (SEQ ID NO:41)
    [VLI]PXPRPGXC (SEQ ID NO:42)
    P[EDYSF]PRPG[TQS]C (SEQ ID NO:43)
(n) DP[HFY]C[AG]W (SEQ ID NO:44)
    P[HFY]C[AG]WD (SEQ ID NO:45)
    DPXC[AG]WD (SEQ ID NO:46)
    CXXXXDPXCXWD (SEQ ID NO:47)
    CXXXDPXCXWD (SEQ ID NO:48)
    CXXDPXCXWD (SEQ ID NO:49)
    CXXCXXXXDXXCXWD (SEQ ID NO:50)
    CXXCXXXDXXCXWD (SEQ ID NO:51)
    CXXCXXDXXCXWD (SEQ ID NO:52)

The following 2 classes are preferred peptides which exclude semaphorin
peptides encoded in open reading frames of Variola major or Vaccinia viruses
or Grasshopper Semaphorin I.
(f) YFF[FY]RE (SEQ ID NO:14)
    D[KY]V[FY][FYL][FYIL][FY] (SEQ ID NO:22)
    D[KY]V[FY][FYIL][FYI][FY] (SEQ ID NO:22)
    [VI]Y[FYL][FYIL]F[RT]X[TN] (SEQ ID NO:23)
    [VI]Y[FYIL][FYI]F[RT]X[TN] (SEQ ID NO:23)
    [VI]Y[FYIL][FYIL]FRX[TN] (SEQ ID NO:23)
    V[FY][FYL][FYIL][FY][RT][EDV][TN] (SEQ ID NO:24)
    V[FY][FYIL][FYI][FY][RT][EDV][TN] (SEQ ID NO:24)
    V[FY][FYIL][FYIL][FY]R[EDV][TN] (SEQ ID NO:24)
(n) CXXXDPXCXWD (SEQ ID NO:48)
    CXXDPXCXWD (SEQ ID NO:49)
    CXXCXXXDXXCXWD (SEQ ID NO:51)
    CXXCXXDXXCXWD (SEQ ID NO:52)

The following 5 classes are peptides which encompass peptides encoded in
open reading frames of Variola major or Vaccinia viruses. Accordingly, in the
event that these viral peptides are not novel per se, the present invention discloses
a hitherto unforeseen and unforeseeable utility for these peptides as
immunosuppressants and targets of anti-viral therapy.

(b) CGT[NG][ASN][YFHG][KRHNQ] (SEQ ID NO:03)
    CGT[NG][ASN]XXP (SEQ ID NO:04)
    CGT[NG]XXXPX[CD] (SEQ ID NO:05)
    CGTXXXXPX[CD]XX[YI] (SEQ ID NO:06)
(f) D[KFY]V[FY][FYIL][FYIL][FY] (SEQ ID NO:22)
    [VI][FY][FYIL][FYIL]F[RT]X[TN] (SEQ ID NO:23)
    V[FY][FYIL][FYIL][FY][RT][EDV][TN] (SEQ ID NO:24)
(i) [RKN]W[TAS][TAS][FYL]L[KR] (SEQ ID NO:30)
    W[TAS][TAS][FYL]LK[ASVIL]XL (SEQ ID NO:33)
    W[TAS][TAS]XLKXXLXC (SEQ ID NO:34)
    WX[TS]XLKXXLXC (SEQ ID NO:35)
(k) SA[VIL]CX[FY]XM (SEQ ID NO:39)
(m) [VLI]PXP[RA]PGXC (SEQ ID NO:42)

The disclosed semaphorin sequence data are used to define a wide variety of 
other semaphorin- and semaphorin receptor-specific binding agents using 
immunologic, chromatographic or synthetic methods available to those skilled in 
the art. 

Of particular significance are peptides comprising unique portions of
semaphorin-specific receptors and polypeptides comprising a sequence substantially
similar to that of a substantially full-length semaphorin receptor. Using
semaphorin peptides, these receptors are identified by a variety of techniques
known to those skilled in the art where a ligand to the target receptor is known,
including expression cloning as set out in the exemplification below. For other
examples of receptor isolation with known ligand using expression cloning, see
Staunton et al (1989) Nature 339, 61; Davis et al (1991) Science 253, 59; Lin et al
(1992) Cell 68, 775; Gearing et al (1989) EMBO 8, 3667; Aruffo and Seed (1987)
PNAS 84, 8573 and references therein. Generally, COS cells are transfected to
express a cDNA library or PCR product and cells producing peptides/polypeptides
which bind a semaphorin/receptor peptide/polypeptide are isolated. For
neurosemaphorin receptors, fetal brain cDNA libraries are preferred; for
immunosemaphorin receptors, libraries derived from activated lymphoid or myeloid
cell lines or tissue derived from sites of inflammation or delayed-type
hypersensitivity are preferred; and for semaphorin and semaphorin receptor
variants used by tumor cells to evade immune surveillance or suppress an immune
response (oncosemaphorins), libraries derived from cancerous tissue or tumor cell
lines resistant to the host immune system are preferred. Alternatively, PCR
primers based upon known semaphorin/receptor sequences such as those disclosed
herein are used to amplify PCR product from such tissues/cells. Other
receptor/ligand isolation methods using immobilized ligand or antibody are known
to those skilled in the art.

Semaphorin receptor peptides with receptor binding specificity are identified
in a variety of ways, including by having conserved consensus sequences with other
semaphorin receptors, by crosslinking to ligand or receptor-specific antibody, or
preferably, by screening such peptides for semaphorin binding or disruption of
semaphorin-receptor binding. Methods for identifying semaphorin receptor
peptides with the requisite binding activity are described herein or otherwise known
to those skilled in the art. By analogous methods, semaphorin receptor peptides
are used to define additional semaphorin peptides with semaphorin binding
specificity, particularly receptor specificity.

The various semaphorin and semaphorin receptor peptides are used to
define functional domains of semaphorins, identify compounds that associate with
semaphorins, design compounds capable of modulating semaphorin-mediated nerve
and immune cell function, and define additional semaphorin and semaphorin
receptor-specific binding agents. For example, semaphorin mutants, including
deletion mutants, are generated from the disclosed semaphorin sequences and used
to identify regions important for specific protein-ligand or protein-protein
interactions, for example, by assaying for the ability to mediate repulsion or
preclude aggregation in cell-based assays as described herein. Further, x-ray
crystallographic data of the disclosed protein are used to rationally design binding
molecules of determined structure or complementarity for modulating growth cone
growth and guidance.

Additional semaphorin- and receptor-specific agents include specific
antibodies that can be modified to a monovalent form, such as Fab, Fab', or Fv,
specifically binding oligopeptides or oligonucleotides and most preferably, small
molecular weight organic receptor antagonists. For example, the disclosed
semaphorin and receptor peptides are used as immunogens to generate semaphorin-
and receptor-specific polyclonal or monoclonal antibodies. See Harlow and Lane
(1988) Antibodies, A Laboratory Manual, Cold Spring Harbor Laboratory, for
general methods. Anti-idiotypic antibodies, especially internal imaging anti-ids,
are also prepared using the disclosures herein.



In addition to semaphorin and semaphorin receptor-derived polypeptides and
peptides, other prospective agents are screened from large libraries of synthetic or
natural compounds. For example, numerous means are available for random and
directed synthesis of saccharide, peptide, and nucleic acid based compounds.
Alternatively, libraries of natural compounds in the form of bacterial, fungal, plant
and animal extracts are available or readily producible. Additionally, natural and
synthetically produced libraries and compounds are readily modified through
conventional chemical, physical, and biochemical means. See, e.g. Houghten et
al. and Lam et al (1991) Nature 354, 84 and 81, respectively and Blake and
Litzi-Davis (1992), Bioconjugate Chem 3, 510.

Useful agents are identified with a range of assays employing a compound
comprising the subject peptides or encoding nucleic acids. A wide variety of in
vitro, cell-free binding assays, especially assays for specific binding to immobilized
compounds comprising semaphorin or semaphorin receptor peptide, find convenient
use. While less preferred, cell-based assays may be used to determine specific
effects of prospective agents on semaphorin-receptor binding; see,
e.g. Schnell and Schwab (1990) supra. Optionally, the intracellular C-terminal
domain is substituted with a sequence encoding an oligopeptide or polypeptide
domain that provides a detectable intracellular signal upon ligand binding different
from the natural receptor. Useful intracellular domains include those of the human
insulin receptor and the TCR, especially domains with kinase activity and domains
capable of triggering calcium influx which is conveniently detected by fluorimetry
by preloading the host cells with Fura-2. More preferred assays involve simple
cell-free in vitro binding of candidate agents to immobilized semaphorin or
receptor peptides, or vice versa. See, e.g. Fodor et al (1991) Science 251, 767 for
light directed parallel synthesis method. Such assays are amenable to scale-up,
high throughput usage suitable for volume drug screening.

Useful agents are typically those that bind to a semaphorin or disrupt the
association of a semaphorin with its receptor. Preferred agents are
semaphorin-specific and do not cross react with other neural or lymphoid cell
membrane proteins. Useful agents may be found within numerous chemical classes,
though typically they are organic compounds; preferably small organic compounds.
Small organic compounds have a molecular weight of more than 150 yet less than about
4,500, preferably less than about 1500, more preferably less than about 500.
Exemplary classes include peptides, saccharides, steroids, heterocyclics,
polycyclics, substituted aromatic compounds, and the like.
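
The molecular weight bands given above for small organic compounds lend themselves to a simple triage step when filtering a screening library. The sketch below is illustrative only: the function name and the returned labels are assumptions, and the "about" qualifiers in the text are treated as hard cut-offs for simplicity.

```python
def classify_small_organic(molecular_weight):
    """Place a compound into the molecular weight bands described in the
    text (more than 150; less than about 4,500, 1500 or 500)."""
    if not 150 < molecular_weight < 4500:
        return "outside small organic range"
    if molecular_weight < 500:
        return "more preferred"
    if molecular_weight < 1500:
        return "preferred"
    return "small organic"


# Hypothetical library entries: (name, molecular weight).
library = [("cmpd-1", 320.4), ("cmpd-2", 980.0), ("cmpd-3", 5200.0)]
for name, weight in library:
    print(name, classify_small_organic(weight))
```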

Selected agents may be modified to enhance efficacy, stability,
pharmaceutical compatibility, and the like. Structural identification of an agent
may be used to identify, generate, or screen additional agents. For example,
where peptide agents are identified, they may be modified in a variety of ways as
described above, e.g. to enhance their proteolytic stability. Other methods of
stabilization may include encapsulation, for example, in liposomes, etc.

The subject binding agents may be prepared in a variety of ways known to
those skilled in the art. For example, peptides under about 60 amino acids can be
readily synthesized today using conventional commercially available automatic
synthesizers. Alternatively, DNA sequences may be prepared encoding the desired
peptide and inserted into an appropriate expression vector for expression in a
prokaryotic or eukaryotic host. A wide variety of expression vectors are available
today and may be used in conventional ways for transformation of a competent
host for expression and isolation. If desired, the open reading frame encoding the
desired peptide may be joined to a signal sequence for secretion, so as to permit
isolation from the culture medium. Methods for preparing the desired sequence,
inserting the sequence into an expression vector, transforming a competent host,
and growing the host in culture for production of the product may be found in U.S.
Patent Nos. 4,710,473, 4,711,843 and 4,713,339.

For therapeutic uses, the compositions and agents disclosed herein may be
administered by any convenient way. Small organics are preferably administered
orally; large molecular weight (e.g. greater than 1 kD, usually greater than 3 kD,
more usually greater than 10 kD) compositions and agents are preferably
administered parenterally, conveniently in a pharmaceutically or physiologically
acceptable carrier, e.g., phosphate buffered saline, saline, deionized water, or the
like. Typically, the compositions are added to a retained physiological fluid such
as blood or synovial fluid. For CNS administration, a variety of techniques are
available for promoting transfer of the therapeutic across the blood brain barrier
including disruption by surgery or injection, drugs which transiently open
adhesion contact between CNS vasculature endothelial cells, and compounds which
facilitate translocation through such cells.

As examples, many of the disclosed therapeutics are amenable to direct
injection or infusion, topical, intratracheal/nasal administration, e.g. through
aerosol, intraocularly, or within/on implants, e.g. fibers (e.g. collagen), osmotic
pumps, grafts comprising appropriately transformed cells, etc. A particularly useful
application involves coating, imbedding or derivatizing fibers, such as collagen
fibers, protein polymers, etc. with therapeutic peptides. Other useful approaches
are described in Otto et al. (1989) J Neuroscience Research 22, 83-91 and Otto and
Unsicker (1990) J Neuroscience 10, 1912-1921. Generally, the amount
administered will be empirically determined, typically in the range of about 10 to
1000 μg/kg of the recipient. For peptide agents, the concentration will generally
be in the range of about 50 to 500 μg/ml in the dose administered. Other additives
may be included, such as stabilizers, bactericides, etc. These additives will be
present in conventional amounts.
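
As a worked illustration of the ranges above (and not a dosing recommendation), the following Python sketch converts a per-kilogram amount into a total amount and an administered volume. It assumes the ranges are read as micrograms per kilogram and micrograms per milliliter; the function name, the 70 kg recipient, and the chosen values are hypothetical and appear nowhere in the disclosure.

```python
def administered_volume_ml(dose_ug_per_kg: float, weight_kg: float,
                           conc_ug_per_ml: float) -> float:
    """Return the volume (ml) that delivers the requested per-kilogram amount."""
    if min(dose_ug_per_kg, weight_kg, conc_ug_per_ml) <= 0:
        raise ValueError("all inputs must be positive")
    total_ug = dose_ug_per_kg * weight_kg   # total amount for this recipient
    return total_ug / conc_ug_per_ml        # volume at the stated concentration


# Hypothetical example: 100 ug/kg for a 70 kg recipient at 500 ug/ml.
volume = administered_volume_ml(100, 70, 500)
print(f"{volume:.1f} ml")  # 7000 ug / 500 ug/ml = 14.0 ml
```

In practice the amount is determined empirically, as stated above; the sketch only shows how the two stated ranges relate arithmetically.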

The invention provides isolated nucleic acid sequences encoding the 
disclosed semaphorin and semaphorin receptor peptides and polypeptides, including 
sequences substantially identical to sequences encoding such polypeptides. An
"isolated" nucleic acid sequence is present as other than a naturally occurring
chromosome or transcript in its natural state and typically is removed from at least
some of the nucleotide sequences with which it is normally associated on a
natural chromosome. A complementary sequence hybridizes to a unique portion of
the disclosed semaphorin sequence under low stringency conditions, for example,
at 50°C and SSC (0.9 M saline/0.09 M sodium citrate), and remains bound
when subject to washing at 55°C with SSC. Regions of non-identity of
complementary nucleic acids are preferably or in the case of homologous nucleic
acids, a nucleotide change providing a redundant codon. A partially pure
nucleotide sequence constitutes at least about 5%, preferably at least about 30%,
and more preferably at least about 90% by weight of total nucleic acid present in a
given fraction.

Unique portions of the disclosed nucleic acid sequence are of length 
sufficient to distinguish them from previously known nucleic acid sequences. Thus,
a unique portion has a nucleotide sequence at least long enough to define a novel
oligonucleotide. Preferred nucleic acid portions encode a unique semaphorin
peptide. The nucleic acids of the invention and portions thereof, other than those
used as PCR primers, are usually at least about 60 bp and usually less than about
60 kb in length. PCR primers are generally between about 15 and 100 nucleotides 
in length. 

Nucleotide (cDNA) sequences encoding several full length semaphorins are 
disclosed in Figs. 1-8. The invention also provides for the disclosed sequences 
modified by transitions, transversions, deletions, insertions, or other modifications 
such as alternative splicing and also provides for genomic semaphorin sequences, 
and gene flanking sequences, including regulatory sequences; included are DNA 
and RNA sequences, sense and antisense. Preferred DNA sequence portions 
include portions encoding the preferred amino acid sequence portions disclosed 
above. For antisense applications where the inhibition of semaphorin expression is 
indicated, especially useful oligonucleotides are between about 10 and 30 
nucleotides in length and include sequences surrounding the disclosed ATG start 
site, especially the oligonucleotides defined by the disclosed sequence beginning 
about 5 nucleotides before the start site and ending about 10 nucleotides after the 
disclosed start site. Other especially useful semaphorin mutants involve deletion or 
substitution modifications of the disclosed cytoplasmic C-termini of transmembrane 
semaphorins. Accordingly, semaphorin mutants with semaphorin binding affinities 
but with altered intracellular signal transduction capacities are produced. 
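
The antisense oligonucleotide window described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the toy cDNA string is hypothetical (it is not one of the disclosed sequences), the function name is invented for the example, and "10 nucleotides after the start site" is interpreted here as counted from the end of the ATG codon.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def antisense_oligo(cdna: str, atg_index: int,
                    upstream: int = 5, downstream: int = 10) -> str:
    """Reverse complement of the window spanning `upstream` nt before the
    ATG start codon through `downstream` nt after it."""
    cdna = cdna.upper()
    if cdna[atg_index:atg_index + 3] != "ATG":
        raise ValueError("atg_index does not point at an ATG codon")
    start = atg_index - upstream
    end = atg_index + 3 + downstream
    if start < 0 or end > len(cdna):
        raise ValueError("window extends past the ends of the sequence")
    sense = cdna[start:end]
    # Complement each base, then reverse, to obtain the antisense strand 5'->3'.
    return sense.translate(COMPLEMENT)[::-1]


toy_cdna = "GCCGCCACCATGGCTTCTGGAGCTC"  # hypothetical sequence, ATG at index 9
oligo = antisense_oligo(toy_cdna, 9)
print(oligo, len(oligo))  # CTCCAGAAGCCATGGTGG 18
```

With the default offsets the oligonucleotide is 18 nucleotides long, which falls within the 10 to 30 nucleotide range stated above.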

For modified semaphorin-encoding sequences or related sequences encoding 
proteins with semaphorin-like functions, there will generally be substantial 
sequence identity between at least a segment thereof and a segment encoding at 
least a portion of the disclosed semaphorin sequence, preferably at least about 
60%, more preferably at least 80%, most preferably at least 90% identity. 
Homologous segments are particularly within semaphorin domain-encoding regions 
and regions encoding protein domains involved in protein-protein, particularly 
semaphorin-receptor interactions and differences within such segments are 
particularly conservative substitutions. 
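
The identity thresholds above can be illustrated with a short Python sketch. The disclosure does not specify how identity is computed, so this example assumes two pre-aligned segments of equal length, counts gap characters as mismatches, and uses hypothetical toy sequences and function names.

```python
def percent_identity(seg_a: str, seg_b: str) -> float:
    """Percent of aligned positions with identical bases ('-' never matches)."""
    if not seg_a or len(seg_a) != len(seg_b):
        raise ValueError("segments must be non-empty and of equal aligned length")
    matches = sum(a == b and a != "-" for a, b in zip(seg_a.upper(), seg_b.upper()))
    return 100.0 * matches / len(seg_a)


def identity_tier(pct: float) -> str:
    """Map a percent identity onto the tiers named in the text."""
    if pct >= 90:
        return "most preferred (at least 90%)"
    if pct >= 80:
        return "more preferred (at least 80%)"
    if pct >= 60:
        return "preferred (at least about 60%)"
    return "below the stated thresholds"


# Hypothetical 20-nt aligned segments differing at two positions.
a = "ATGGCTTCTGGAGCTCAGTA"
b = "ATGGCATCTGGTGCTCAGTA"
pct = percent_identity(a, b)
print(pct, identity_tier(pct))  # 90.0 most preferred (at least 90%)
```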

Typically, the invention's semaphorin peptide encoding polynucleotides are 
associated with heterologous sequences. Examples of such heterologous sequences 
include regulatory sequences such as promoters, enhancers, response elements,
signal sequences, polyadenylation sequences, etc., introns, 5' and 3' noncoding
regions, etc. Other useful heterologous sequences are known to those skilled in the
art or otherwise disclosed in references cited herein. According to a particular
embodiment of the invention, portions of the semaphorin encoding sequence are
spliced with heterologous sequences to produce soluble, secreted fusion proteins,
using appropriate signal sequences and optionally, a fusion partner such as β-Gal.

The disclosed sequences are also used to identify and isolate other natural 
semaphorins and analogs. In particular, the disclosed nucleic acid sequences are
used as hybridization probes under low stringency or as PCR primers, e.g.
oligonucleotides encoding functional semaphorin domains are 32P-labeled and used
to screen λ cDNA libraries at low stringency to identify similar cDNAs that encode
proteins with related functional domains. Additionally, nucleic acids encoding at
least a portion of the disclosed semaphorin are used to characterize tissue specific
expression of semaphorin as well as changes of expression over time, particularly
during organismal development or cellular differentiation.

The semaphorin encoding nucleic acids can be subject to alternative 
purification, synthesis, modification, sequencing, expression, transfection, 
administration or other use by methods disclosed in standard manuals such as 
Molecular Cloning, A Laboratory Manual (2nd Ed., Sambrook, Fritsch and 
Maniatis, Cold Spring Harbor), Current Protocols in Molecular Biology (Eds.
Ausubel, Brent, Kingston, Moore, Seidman, Smith and Struhl, Greene Publ. Assoc.,
Wiley-Interscience, NY, NY, 1992) or that are otherwise known in the art. For
example, the nucleic acids can be modified to alter stability, solubility, binding
affinity and specificity, etc. Semaphorin-encoding sequences can be selectively
methylated, etc. The nucleic acid sequences of the present invention may also be
modified with a label capable of providing a detectable signal, either directly or 
indirectly. Exemplary labels include radioisotopes, fluorescers, biotinylation, etc. 

The invention also provides vectors comprising nucleic acids encoding 
semaphorin peptides, polypeptides or analogs. A large number of vectors, 
including plasmid and viral vectors, have been described for expression in a variety
of eukaryotic and prokaryotic hosts. Advantageously, vectors may also include a
promoter operably linked to the semaphorin-encoding portion. Vectors will often
include one or more replication systems for cloning or expression, one or more
markers for selection in the host, e.g. antibiotic resistance. The inserted
semaphorin coding sequences may be synthesized, isolated from natural sources,
prepared as hybrids, etc. Suitable host cells may be
transformed/transfected/infected by any suitable method including electroporation,
CaCl2-mediated DNA uptake, viral infection, microinjection, microprojectile, or
other methods.

Appropriate host cells include bacteria, archaebacteria, fungi, especially
yeast, and plant and animal cells, especially mammalian cells. Of particular
interest are E. coli, B. subtilis, Saccharomyces cerevisiae, SF9 cells, C129 cells,
293 cells, Neurospora, and CHO, COS, HeLa cells, immortalized mammalian
myeloid and lymphoid cell lines, and pluripotent cells, especially mammalian ES
cells and zygotes. Preferred replication systems include M13, ColE1, SV40,
baculovirus, lambda, adenovirus, AAV, BPV, etc. A large number of
transcription initiation and termination regulatory regions have been isolated and
shown to be effective in the transcription and translation of heterologous proteins in
the various hosts. Examples of these regions, methods of isolation, manner of 
manipulation, etc. are known in the art. Under appropriate expression conditions, 
host cells can be used as a source of recombinantly produced semaphorins or 
analogs. 

For the production of stably transformed cells and transgenic animals,
nucleic acids encoding the disclosed semaphorins may be integrated into a host
genome by recombination events. For example, such a sequence can be
microinjected into a cell, and thereby effect homologous recombination at the site
of an endogenous gene, an analog or pseudogene thereof, or a sequence with
substantial identity to a semaphorin-encoding gene. Other recombination-based
methods such as nonhomologous recombinations, deletion of endogenous gene by
homologous recombination, especially in pluripotent cells, etc., provide additional
applications. Preferred transgenics and stable transformants over-express the
disclosed receptor gene and find use in drug development and as a disease model.
Alternatively, knock-out cells and animals find use in development and functional
studies. Methods for making transgenic animals, usually rodents, from ES cells or 
zygotes are known to those skilled in the art.

The compositions and methods disclosed herein may be used to effect gene
therapy. See, e.g. Zhu et al. (1993) Science 261, 209-211; Gutierrez et al. (1992)
Lancet 339, 715-721. For example, cells are transfected with semaphorin
sequences operably linked to gene regulatory sequences capable of effecting altered
semaphorin expression or regulation. To modulate semaphorin translation, cells
may be transfected with complementary antisense polynucleotides. For gene
therapy involving the transfusion of semaphorin transfected cells, administration
will depend on a number of variables that are ascertained empirically. For
example, the number of cells will vary depending on the stability of the transfused
cells. Transfusion medium is typically a buffered saline solution or other
pharmacologically acceptable solution. Similarly the amount of other administered
compositions, e.g. transfected nucleic acid, protein, etc., will depend on the 
manner of administration, purpose of the therapy, and the like. 

The following examples are offered by way of illustration and not by way
of limitation.

EXAMPLES 

I. Isolation and characterization of Grasshopper Semaphorin I (SEQ ID
NOs. 57 and 58) (previously referred to as Fasciclin IV)

In order to identify cell surface molecules that function in selective
fasciculation, a series of monoclonal antibody (MAb) screens was conducted. The
immunogen used for most of these screens was membranes from the longitudinal
connectives (the collection of longitudinal axons) between adjacent segmental
ganglia of the nervous system of the larval grasshopper. From these screens, MAb
3B11 and 8C6 were used to purify and characterize two surface glycoproteins,
fasciclin I and fasciclin II, see, Bastiani et al., 1987; the genes encoding both were
subsequently cloned, see, Snow et al. 1989, Zinn et al. 1988, and Harrelson and 
Goodman, 1988. 

Another MAb isolated during these screens, MAb 6F8, was chosen for the 
present study because, just as with fasciclin I and fasciclin II, the antigen
recognized by this MAb is expressed on a different but overlapping subset of axon
pathways in the developing CNS. The 6F8 antigen appears to be localized on the
outside of cell surfaces, as indicated by MAb binding when incubated both in live
preparations, and in fixed preparations in which no detergents have been added.
Because the 6F8 antigen is a surface glycoprotein expressed on a subset of axon 
fascicles (see below), we call it fasciclin IV. 

Fasciclin IV expression begins early in embryonic development before 
axonogenesis. At 29% of development, expression is seen on the surface of the
midline mesectodermal cells and around 5-7 neuroblasts and associated ectodermal
cells per hemisegment. This expression is reminiscent of the mesectodermal and
neuroblast-associated expression observed with both fasciclin I and fasciclin II;
however, in each case, the pattern resolves into a different subset of neuroblasts
and associated ectodermal cells.

At 32% of development, shortly after the onset of axonogenesis in the CNS, 
fasciclin IV expression is seen on the surface of the axons and cell bodies of the 
three pairs of MP4, MP5, and MP6 midline progeny, the three U motoneurons,
and on several unidentified neurons in close proximity to the U's. This is in
contrast to fasciclin II, which at this stage is expressed on the MP1 and dMP2
neurons, and fasciclin I, which is expressed on the U neurons but not on any 
midline precursor progeny. 

The expression of fasciclin IV on a subset of axon pathways is best 
observed around 40% of development, after the establishment of the first 
longitudinal and commissural axon pathways. At this stage, the protein is
expressed on two longitudinal axon fascicles, a subset of commissural axon 
fascicles, a tract extending anteriorly along the midline, and a subset of fascicles in 
the segmental nerve (SN) and intersegmental nerve (ISN) roots. 

Specifically, fasciclin IV is expressed on the U fascicle, a longitudinal 
pathway (between adjacent segmental neuromeres) pioneered in part by the U
neurons, and on the A/P longitudinal fascicle (in part an extension of the U fascicle
within each segmental neuromere). In addition, fasciclin IV is also expressed on a
second narrower, medial, and more ventral longitudinal pathway. The U axons
turn and exit the CNS as they pioneer the ISN; the U's and many other axons
within the ISN express fasciclin IV. The continuation of the U fascicle posterior to
the ISN junction is also fasciclin IV-positive. The specificity of fasciclin IV for
distinct subsets of longitudinal pathways can be seen by comparing fasciclin IV and
fasciclin II expression in the same embryo; fasciclin IV is expressed on the U and
A/P pathways whereas fasciclin II is expressed on the MP1 pathway. 

The axons in the median fiber tract (MFT) also express fasciclin IV. The 
MFT is pioneered by the three pairs of progeny of the midline precursors MP4, 
MP5, and MP6. The MFT actually contains three separate fascicles. The axons
of the two MP4 progeny pioneer the dorsal MFT fascicle and then bifurcate at the
posterior end of the anterior commissure; whereas the axons of the two MP6
progeny pioneer the ventral MFT fascicle and then bifurcate at the anterior end of
the posterior commissure. Fasciclin IV is expressed on the cell bodies of the six
MP4, MP5, and MP6 neurons, and on their growth cones and axons as they extend
anteriorly in the MFT and bifurcate in one of the two commissures. However,
this expression is regional in that once these axons bifurcate and begin to extend
laterally across the longitudinal pathways and towards the peripheral nerve roots,
their expression of fasciclin IV greatly decreases. Thus, fasciclin IV is a label for
the axons in the MFT and their initial bifurcations in both the anterior and
posterior commissures. It appears to be expressed on other commissural fascicles
as well. However, the commissural expression of fasciclin IV is distinct from the
transient expression of fasciclin II along the posterior edge of the posterior
commissure, or the expression of fasciclin I on several different commissural axon
fascicles in both the anterior and posterior commissure (Bastiani et al., 1987;
Harrelson and Goodman, 1988). 

Fasciclin IV is also expressed on a subset of motor axons exiting the CNS 
in the SN. The SN splits into two major branches, one anterior and the other 
posterior, as it exits the CNS. Two large bundles of motoneuron axons in the 
anterior branch express fasciclin IV at high levels; one narrow bundle of
motoneuron axons in the posterior branch expresses the protein at much lower
levels. Fasciclin IV is also expressed on many of the axons in the ISN. 

The CNS and nerve root expression patterns of fasciclin IV, fasciclin I, and 
fasciclin II at around 40% of embryonic development indicate that although there is
some overlap in their patterns (e.g., both fasciclin IV and fasciclin I label the U
axons), these three surface glycoproteins label distinct subsets of axon pathways in 
the developing CNS.

Fasciclin IV is expressed on epithelial bands in the developing limb bud

Fasciclin IV is expressed on the developing limb bud epithelium in 
circumferential bands; at 34.5% of development these bands can be localized with 
respect to constrictions in the epithelium that mark presumptive segment 
boundaries. In addition to a band just distal to the trochanter/coxa segment
boundary, bands are also found in the tibia, femur, coxa, and later in development
a fifth band is found in the tarsus. Fasciclin IV is also expressed in the nascent
chordotonal organ in the dorsal aspect of the femur. The bands in the tibia,
trochanter, and coxa completely encircle the limb. However, the femoral band is
incomplete, containing a gap on the anterior epithelia of this segment.

The position of the Ti1 axon pathway with respect to these bands of
fasciclin IV-positive epithelia suggests a potential role for fasciclin IV in guiding
the Ti1 growth cones. First, the band of fasciclin IV expression in the trochanter,
which is approximately three epithelial cell diameters in width when encountered
by the Ti1 growth cones, is the axial location where the growth cones reorient
from proximal migration to circumferential branch extension. The Tr1 cell, which
marks the location of the turn, lies within this band, usually over the central or the
proximal cell tier. Secondly, although there is a more distal fasciclin IV
expressing band in the femur, where a change in Ti1 growth is not observed, there
exists a gap in this band such that fasciclin IV expressing cells are not traversed by
the Ti1 growth cones. The Ti1 axons also may encounter a fasciclin IV expressing
region within the coxa, where interactions between the growth cones, the epithelial
cells, and the Cx1 guidepost cells have not yet been investigated.

In addition to its expression over the surface of bands of epithelial cells, 
fasciclin IV protein, as visualized with MAb 6F8, is also found on the basal
surface of these cells in a punctate pattern. This punctate staining is not an artifact
of the HRP immunocytochemistry since fluorescent visualization of MAb 6F8 is
also punctate. The non-neuronal expression of fasciclin IV is not restricted to limb
buds. Circumferential epithelial bands of fasciclin IV expression are also seen on
subesophageal mandibular structures and on the developing antennae.

MAb directed against fasciclin IV can alter the formation of the Ti1 axon
pathway in the limb bud

The expression of fasciclin IV on an epithelial band at a key choice point in
the formation of the Ti1 axon pathway led us to ask whether this protein is
involved in growth cone guidance at this location. To answer this question, we
cultured embryos, or epithelial fillets (e.g., O'Connor et al., 1990), during the
5% of development necessary for normal pathway formation, either in the presence
or absence of MAb 6F8 or 6F8 Fab fragments. Under the culture conditions used
for these experiments, defective Ti1 pathways are observed in 14% of limbs
(Chang et al., 1992); this defines the baseline of abnormalities observed using
these conditions. For controls we used other MAbs and their Fab fragments that
either bind to the surfaces of these neurons and epithelial cells (MAb 3B11 against
the surface protein fasciclin I) or do not (MAb 4D9 against the nuclear protein
engrailed; Patel et al., 1989). To assess the impact of MAb 6F8 on Ti1 pathway
formation, we compared the percentage of aberrant pathways observed following
treatment with MAb 6F8 to that observed with MAbs 3B11 and 4D9. Our cultures
began at 32% of development when the Ti1 growth cones have not yet reached the
epithelium just distal to the trochanter/coxa boundary and therefore have not
encountered epithelial cells expressing fasciclin IV. Following approximately 30
hours in culture (~4% of development), embryos were fixed and immunostained
with antibodies to HRP in order to visualize the Ti1 axons and other neurons in the
limb bud. Criteria for scoring the Ti1 pathway, and the definition of "aberrant",
are described in detail in the Experimental Procedures.

Although MAb 6F8 does not arrest pathway formation, several types of
distinctive, abnormal pathways are observed. These defects generally begin where
growth cones first contact the fasciclin IV expressing cells in the trochanter.
Normally, the Ti1 neurons each have a single axon, and the axons of the two cells
are fasciculated in that portion of the pathway within the trochanter. Following
treatment with MAb 6F8, multiple long axon branches are observed within, and
proximal to, the trochanter. Two major classes of pathways are taken by these
branches; in 36% of aberrant limbs, multiple, long axon branches extend ventrally
in the region distal to the Cx1 cells which contains the band of fasciclin IV
expressing epithelial cells. In the ventral region of the trochanter, these branches
often independently turn proximally to contact the Cx1 cells, and thus complete the
pathway in this region.

In the second major class of pathway defect, seen in 47% of aberrant limbs, 
axon branches leave the trochanter at abnormal, dorsal locations, and extend 
proximally across the trochanter/coxa boundary. These axons then veer ventrally,
often contacting the Cx1 neurons. The remaining 17% of defects include
defasciculation distal to the trochanter, axon branches that fail to turn proximally in
the ventral trochanter and continue into the posterior compartment of the limb, and
axon branches which cross the trochanter/coxa boundary and continue to extend
proximally without a ventral turn.

When cultured in the presence of MAb 6F8, 43% of limbs exhibited
malformed Ti1 pathways (n = 381) as compared to 11% with MAb 3B11 (n =
230) and 5% with MAb 4D9 (n = 20). These percentages are pooled from
treatments with MAbs concentrated from hybridoma supernatant, IgGs isolated
from these supernatants, and Fab fragments isolated from these IgG preparations
(see Experimental Procedures). The frequency of malformed Ti1 pathways and the
types of defects observed showed no significant variation regardless of the method
of antibody preparation or type of antibody used. Since Fabs show results similar
to IgGs, the effects of MAb 6F8 are not due to cross linking by the bivalent IgG.

In summary, following treatment with MAb 6F8, the Ti1 pathway typically
exhibits abnormal morphology beginning just distal to the trochanter and at the site
of fasciclin IV expression. The two most common types of Ti1 pathway defects
described above occur in 36% of experimental limbs (treated with MAb 6F8), but
are seen in only 4% of control limbs (treated with MAbs 3B11 and 4D9).
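
The percentages reported above can be cross-checked with a short Python sketch. Two points are assumptions of this illustration rather than part of the reported analysis: the counts are reconstructed by rounding the stated percentages against the stated n values, and the pooled two-proportion z statistic is a generic textbook comparison, not a test described in the source.

```python
import math


def two_proportion_z(x1: int, n1: int, x2: int, n2: int) -> float:
    """Pooled two-proportion z statistic for x1/n1 versus x2/n2."""
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (x1 / n1 - x2 / n2) / se


# Counts reconstructed from the reported percentages (approximate, not source data).
aberrant_6f8 = round(0.43 * 381)    # about 164 of 381 limbs
aberrant_3b11 = round(0.11 * 230)   # about 25 of 230 limbs
z = two_proportion_z(aberrant_6f8, 381, aberrant_3b11, 230)

# Share of all 6F8-treated limbs showing one of the two major defect classes:
# 43% aberrant overall, of which 36% + 47% fall into the two classes.
two_class_share = 0.43 * (0.36 + 0.47)

print(f"z = {z:.2f}")
print(f"two major classes: {100 * two_class_share:.0f}% of treated limbs")
```

The second figure shows that the two defect classes (36% and 47% of aberrant limbs) together account for roughly 36% of all MAb 6F8-treated limbs, consistent with the summary figure given above.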
Fasciclin IV cDNAs encode a novel integral membrane protein

Grasshopper fasciclin IV was purified by passing crude embryonic 
grasshopper lysates over a MAb 6F8 column. After affinity purification, the 
protein was eluted, precipitated, denatured, modified at cysteines, and digested 
with either trypsin or Lys-C. Individual peptides were resolved by reverse phase 
HPLC and microsequenced using standard methods. 

The amino acid sequences derived from these proteolytic fragments were
used to generate oligonucleotide probes for PCR experiments, resulting in products
that were used to isolate cDNA clones from the Zinn embryonic grasshopper
cDNA library (Snow et al., 1988). Sequence analysis of these cDNAs reveals a
single open reading frame (ORF) encoding a protein with two potential
hydrophobic stretches of amino acids: an amino-terminal signal sequence of 20
residues and (beginning at amino acid 627) a potential transmembrane domain of
25 amino acids. Thus, the deduced protein has an extracellular domain of 605
amino acids, a transmembrane domain, and a cytoplasmic domain of 78 amino
acids. The calculated molecular mass of the mature fasciclin IV protein is 80 kd
and is confirmed by Western blot analysis of the affinity purified and endogenous
protein as described below. The extracellular domain of the protein includes 16
cysteine residues that fall into three loose clusters but do not constitute a repeated
domain and are not similar to other known motifs with cysteine repeats. There are
also six potential sites for N-linked glycosylation in the extracellular domain.
Treatment of affinity purified fasciclin IV with N-Glycanase demonstrates that
fasciclin IV does indeed contain N-linked oligosaccharides. Fasciclin IV shows no
sequence similarity when compared with other proteins in the PIR data base using
BLASTP (Altschul et al., 1990), and is therefore a novel type I integral membrane
protein.

A polyclonal antiserum directed against the cytoplasmic domain of the
protein encoded by the fasciclin IV cDNA was used to stain grasshopper embryos
at 40% of development. The observed staining pattern was identical to that seen
with MAb 6F8. On Western blots, this antiserum recognizes the protein we
affinity purified using MAb 6F8 and then subjected to microsequence analysis.
Additionally, the polyclonal serum recognizes a protein of similar molecular mass
from grasshopper embryonic membranes. Taken together these data indicate that
the sequence we have obtained is indeed fasciclin IV.

Four other cell surface proteins that label subsets of axon pathways in the
insect nervous system (fasciclin I, fasciclin II, fasciclin III, and neuroglian) are
capable of mediating homophilic cell adhesion when transfected into S2 cells in
vitro (Snow et al., 1989; Elkins et al., 1990b; Grenningloh et al., 1990). To ask
whether fasciclin IV can function as a homophilic cell adhesion molecule, the
fasciclin IV cDNA with the complete ORF was placed under the control of the
inducible metallothionein promoter (Bunch et al., 1988), transfected into S2 cells,
and assayed for its ability to promote adhesion in normally non-adhesive S2 cells.
Following induction with copper, fasciclin IV was synthesized in these S2 cells as
shown by Western blot analysis and cell surface staining of induced S2 cells with
the polyclonal antiserum described above.

We observed no evidence for aggregation upon induction of fasciclin IV
expression, thus suggesting that, in contrast to the other four proteins, fasciclin IV
does not function as a homophilic cell adhesion molecule. Alternatively, fasciclin
IV-mediated aggregation might require some further posttranslational modification,
or co-factor, not supplied by the S2 cells, but clearly this protein acts differently in
the S2 cell assay than the other four axonal glycoproteins previously tested. This
is consistent with the pattern of fasciclin IV expression in the embryonic limb since
only the epithelial cells and not the Ti1 growth cones express fasciclin IV, and yet
antibody blocking experiments indicate that fasciclin IV functions in the epithelial
guidance of these growth cones. Such results suggest that fasciclin IV functions in
a heterophilic adhesion or signaling system.

Discussion 

Fasciclin IV is expressed on groups of axons that fasciculate in the CNS, 
suggesting that, much like other insect axonal glycoproteins, it functions as a
homophilic cell adhesion molecule binding these axons together. Yet, in the limb
bud, fasciclin IV is expressed on a band of epithelium but not on the growth cones
that reorient along this band, suggesting a heterophilic function. That fasciclin IV
functions in a heterophilic rather than homophilic fashion is supported by the lack
of homophilic adhesion in S2 cell aggregation assays. In contrast, fasciclin I,
fasciclin II, fasciclin III, and neuroglian all can function as homophilic cell
adhesion molecules (Snow et al., 1989; Elkins et al., 1990b; Grenningloh et al.,
1990). 

cDNA sequence analysis indicates that fasciclin IV is an integral membrane
protein with a novel sequence not related to any protein in the present data base.
Thus, fasciclin IV represents a new type of protein that functions in the epithelial
guidance of pioneer growth cones in the developing limb bud. Given its 
expression on a subset of axon pathways in the developing CNS, fasciclin IV 
functions in the guidance of CNS growth cones as well. 
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The resultl^fen the MAb blocking experiments i^rminate several issues in 
Til growth cone guidance and axon morphogenesis in the limb. First, the most 
striking change in growth cone behavior in the limb is the cessation of proximal 
growth and initiation of circumferential extension of processes upon encountering 
5 the trochanter/coxa boundary region (Bentley and Caudy, 1983; Caudy and 
Bentley, 1987). This could be because the band of epithelial cells within the 
trochanter promotes circumferential growth, or because the cells comprising the 
trochanter/coxa boundary and the region just proximal to it are non-permissive or 
aversive for growth cone migration, or both. The extension of many axon 
10 branches across the trochanter/coxa boundary following treatment with MAb 6F8 
suggests that the trochanter/coxa boundary cells, which do not express fasciclin IV, 
are not aversive or non-permissive. Thus the change in behavior at the boundary 
appears to be due to the ability of fasciclin IV expressing epithelial cells to 
promote circumferential extension of processes from the Til growth cones. 

Secondly, treatment with MAb 6F8 results in frequent defasciculation of the 
axons of the two Ti1 neurons, and also formation of abnormal multiple axon 
branches, within the trochanter over fasciclin IV-expressing epithelial cells. 
Previous studies have shown that treatment with antibodies against ligands 
expressed on non-neural substrates (Landmesser et al., 1988), or putative 
competitive inhibitors of substrate ligands (Wang and Denburg, 1992) can promote 
defasciculation and increased axonal branching. Our results suggest that Ti1 
axon-axon fasciculation and axon branching also are strongly influenced by 
interactions with substrate ligands, and that fasciclin IV is a component 
of this interaction within the trochanter. 

Thirdly, despite the effects of MAb 6F8 on axon branching, and on 
crossing the trochanter/coxa boundary, there remains a pronounced tendency for 
branches to grow ventrally both within the trochanter and within the distal region 
of the coxa. Consequently, not all signals that can promote ventral migration of the 
growth cones have been blocked by MAb 6F8 treatment. Antibody treatment 
may have a threshold effect in which ventral growth directing properties of 
fasciclin IV are more robust, and less incapacitated by treatment, than other 
features; alternatively, guidance information promoting ventral migration may be 
independent of fasciclin IV. Time lapse video experiments to determine how the 
abnormal pathways we observe actually form can resolve these issues. 

These results demonstrate that fasciclin IV functions as a guidance cue for 
the Ti1 growth cones just distal to the trochanter/coxa boundary, that it is required 
for these growth cones to stop proximal growth and spread circumferentially, and 
that the function of fasciclin IV in Ti1 pathway formation results from interactions 
between a receptor/ligand on the Ti1 growth cones and fasciclin IV on the surface 
of the band of epithelial cells, which result in changes in growth cone morphology 
and subsequent reorientation. Fasciclin IV appears to elicit this change in growth 
cone morphology and orientation via regulation of adhesion, a signal transduction 
function, or a combination of the two. 



Experimental Procedures 
Immunocytochemistry 
Grasshopper embryos were obtained from a colony maintained at U.C. 
Berkeley and staged by percentage of total embryonic development (Bentley et al., 
1979). Embryos were dissected in PBS, fixed for 40 min in PEM-FA [0.1 M 
PIPES (pH 6.95), 2.0 mM EGTA, 1.0 mM MgSO4, 3.7% formaldehyde], washed 
for 1 hr with three changes in PBT (1x PBS, 0.5% Triton X-100, 0.2% BSA), 
blocked for 30 min in PBT with 5% normal goat serum, and incubated overnight at 
4°C in primary antibody. PBSap (1x PBS, 0.1% Saponin, 0.2% BSA) was used in 
place of PBT with MAb 8G7. Antibody dilutions were as follows: MAb 6F8 1:1; 
polyclonal antisera directed against a fasciclin IV bacterial fusion protein (#98-3) 
1:400; MAb 8G7 1:4; MAb 8C6 1:1. The embryos were washed for one hour in 
PBT with three changes, blocked for 30 min, and incubated in secondary antibody 
for at least 2 hr at room temperature. The secondary antibodies were 
HRP-conjugated goat anti-mouse and anti-rat IgG (Jackson Immunoresearch Lab), and 
were diluted 1:300. Embryos were washed in PBT for one hour with three 
changes and then reacted in 0.5% diaminobenzidine (DAB) in PBT. The reaction 
was stopped with several washes in PBS and the embryos were cleared in a 
glycerol series (50%, 70%, 90%), mounted and viewed under Nomarski or bright 
field optics. For double-labelled preparations the first HRP reaction was done in 
PBT containing 0.06% NiCl, followed by washing, blocking, and incubation 
overnight in the second primary antibody. The second antibody was visualized 
with a DAB reaction as described above. Embryos cultured in the presence of 
monoclonal antibodies were fixed and incubated overnight in goat anti-HRP 
(Jackson Immunoresearch Labs) conjugated to RITC (Molecular Probes), washed 
for one hour in PBT with three changes, mounted in 90% glycerol, 2.5% DABCO 
(Polysciences), and viewed under epifluorescence. S2 cells were stained with 
polyclonal sera #98-3 diluted 1:400 and processed as described previously (Snow et 
al., 1989). 



Monoclonal Antibody Blocking Experiments 

In order to test for functional blocking, monoclonal antibody reagents were 
prepared as follows. Hybridoma supernatant was brought to 20% with 
H2O-saturated (NH4)2SO4, incubated on ice for 1 hr, and spun at 15,000 g at 4°C 
for 20 min. The supernatant was brought to 56% with H2O-saturated (NH4)2SO4, 
incubated overnight at 4°C, and spun as above. The pellet was resuspended in PBS 
using approximately 1/40 volume of the original hybridoma supernatant (often 
remaining a slurry) and dialyzed against 1x PBS overnight at 4°C with two changes. 
This reagent is referred to as "concentrated hybridoma supernatant." Purified IgG 
was obtained by using the Immunopure Plus Immobilized Protein A IgG 
Purification Kit (Pierce) to isolate IgG from the concentrated hybridoma 
supernatant. Fab fragments were obtained using the ImmunoPure Fab Preparation 
Kit (Pierce) from the previously isolated IgGs. For blocking experiments each 
reagent was diluted into freshly made supplemented RPMI culture media 
(O'Connor et al., 1990) and dialyzed overnight at 4°C against 10 volumes of the 
same culture media. Dilutions were as follows: concentrated hybridoma 
supernatant 1:4; purified IgG 150 µg/ml; Fab 75 µg/ml. 

Embryos for culture experiments were carefully staged to between 31 and 
32% of development. As embryos in each clutch typically differ by less than 1% 
of embryonic development from each other, the growth cones of the Ti1 neurons at 
the beginning of the culture period were located approximately in the mid-femur, 
well distal to the trochanter/coxa segment boundary. From each clutch at least two 
limbs were filleted and the Ti1 neurons labelled with the lipophilic dye DiI 
(Molecular Probes) as described (O'Connor et al., 1990) in order to confirm the 
precise location of the Ti1 growth cones. Prior to culturing, embryos were 
sterilized and dissected (Chang et al., 1992). The entire amnion and dorsal 
membrane were removed from the embryo to ensure access of the reagents during 
culturing. Embryos were randomly divided into groups and cultured in one of the 
blocking reagents described above. Cultures were incubated with occasional 
agitation at 30°C for 30 hrs. At the end of the culture period embryos were fixed 
and processed for analysis as described above in immunocytochemistry. 

For each culture experiment, the scoring of the Ti1 pathway in each limb 
was confirmed independently by a second observer. There was no statistically 
significant variation between the two observers. Limbs from MAb cultured 
embryos were compared to representative normal limbs from non-MAb cultured 
embryos and were scored as abnormal if any major deviation from the normal Ti1 
pathway was observed. The Ti1 pathway was scored as abnormal for one or more 
of the following observed characteristics: (1) defasciculation for a minimum 
distance of approximately 25 µm anywhere along the pathway, (2) multiple axon 
branches that extended ventrally within the trochanter, (3) presence of one or 
more axon branches that crossed the trochanter/coxa boundary dorsal to the Cx1 
cells, but then turned ventrally in the coxa and contacted the Cx1 cells, (4) the 
presence of axon branches that crossed the trochanter/coxa segment boundary, did 
not turn ventrally, but continued proximally toward the CNS, and (5) failure of 
ventrally extended axons within the trochanter to contact and reorient proximally to 
the Cx1 cells. For each MAb tested, the data are presented as a percentage of the 
abnormal Ti1 pathways observed. The raw data are presented in Table 1. 
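
The scoring rule described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the procedure used to produce Table 1: the `Limb` record, its field names, and the short criterion labels are hypothetical, and the only logic taken from the text is that a limb counts as abnormal when one or more of the five listed characteristics is observed, with results reported as a percentage.

```python
from dataclasses import dataclass

# Criterion numbers follow the five characteristics listed in the text.
# The short labels are paraphrases added here for readability.
CRITERIA = {
    1: "defasciculation over the stated minimum distance",
    2: "multiple axon branches extending ventrally within the trochanter",
    3: "branch crosses the boundary dorsally, then turns ventrally in the coxa",
    4: "branch crosses the boundary and continues proximally toward the CNS",
    5: "ventral axons fail to contact and reorient at the coxa cells",
}


@dataclass(frozen=True)
class Limb:
    """One scored limb (hypothetical record layout)."""
    limb_id: str
    observed: frozenset  # criterion numbers seen in this limb

    def is_abnormal(self) -> bool:
        unknown = set(self.observed) - set(CRITERIA)
        if unknown:
            raise ValueError(f"unknown criteria: {sorted(unknown)}")
        # Abnormal if one or more of the listed characteristics is present.
        return len(self.observed) > 0


def percent_abnormal(limbs) -> float:
    """Percentage of limbs scored as abnormal."""
    limbs = list(limbs)
    if not limbs:
        raise ValueError("no limbs scored")
    return 100.0 * sum(limb.is_abnormal() for limb in limbs) / len(limbs)
```

A second observer's calls could be stored in the same structure and compared limb by limb; the statistical comparison mentioned in the text is not reproduced here.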



Protein Affinity Purification and Microsequencing 

Grasshopper fasciclin IV was purified by passing crude embryonic 
grasshopper lysate (Bastiani et al., 1987) over an Affi-Gel 15 column (Bio Rad) 
conjugated with the monoclonal antibody 6F8. Protein was eluted with 50 mM 
DEA (pH 11.5), 0.1% Lauryldimethylamine oxide (Cal Bio Chem), and 1 mM 
EDTA. Protein was then precipitated, denatured, modified at cysteines, and 
digested with either trypsin or Lys-C (Boehringer-Mannheim). Individual peptides 
were resolved by RP-HPLC and microsequenced (Applied Biosystems 4771 
Microsequencer) using standard chemistry. 

PCR Methods 



DNA complementary to poly(A)+ RNA from 45%-50% grasshopper 
embryos was prepared (Sambrook et al., 1989). PCR was performed using Perkin 
Elmer Taq polymerase (Saiki et al., 1988), and partially degenerate (based on 
grasshopper codon bias) oligonucleotides in both orientations corresponding to a 
portion of the protein sequence of several fasciclin IV peptides as determined by 
microsequencing. These oligonucleotides were designed so as not to include all of 
the peptide-derived DNA sequence, leaving a remaining 9-12 base pairs that could 
be used to confirm the correct identity of amplified products. All possible 
combinations of these sequences were tried. Forty cycles were performed, with the 
parameters of each cycle as follows: 96°C for one min; a sequentially decreasing 
annealing temperature (2°C/cycle, starting at 65°C and ending at 55°C for the 
remaining 35 cycles) for 1 min; and 72°C for one min. Reaction products were 
cloned into the Sma site of M13 mp10 and sequenced. Two products, 1074 bp and 
288 bp in length, contained DNA 3' to the oligonucleotide sequences that encoded 
the additional amino acid sequence of the fasciclin IV peptide from which the 
oligonucleotides were derived. These two fragments have one end in common, and 
the oligonucleotides used to amplify them correspond to the amino acid sequences 
MYVQFGEE and MDEAVPAF (fasciclin IV residues 29-386), and HTLMDEA and 
KNYVVRMDG (fasciclin IV residues 376-472). 
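
The stepped annealing profile described above can be written out cycle by cycle. The following Python sketch is an illustration only: the function names and default arguments are hypothetical, and the temperatures and one-minute step times are simply those quoted in the text. It shows that lowering the annealing temperature by 2°C per cycle from 65°C reaches 55°C at the sixth cycle, which leaves 35 of the 40 cycles at 55°C.

```python
def annealing_schedule(total_cycles=40, start_c=65.0, final_c=55.0, step_c=2.0):
    """Annealing temperature (deg C) for each cycle of a stepped-down profile.

    The temperature drops by step_c each cycle until it reaches final_c,
    then stays at final_c for all remaining cycles.
    """
    if total_cycles <= 0 or step_c <= 0 or start_c < final_c:
        raise ValueError("invalid schedule parameters")
    temps = []
    current = start_c
    for _ in range(total_cycles):
        temps.append(max(current, final_c))  # never go below the final value
        current -= step_c
    return temps


def cycle_steps(anneal_c):
    """(step name, temperature in deg C, seconds) for one cycle, per the text."""
    return [
        ("denature", 96.0, 60),
        ("anneal", anneal_c, 60),
        ("extend", 72.0, 60),
    ]


if __name__ == "__main__":
    for number, temp in enumerate(annealing_schedule(), start=1):
        print(number, cycle_steps(temp))
```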

cDNA Isolation and Sequence Analysis 

Both PCR products were used to screen 1 x 10^6 clones from a grasshopper 
embryonic cDNA library (Snow et al., 1988). Twenty-one clones that hybridized to 
both fragments were recovered, and one 2600 bp clone was sequenced using the 
dideoxy chain termination method (Sanger et al., 1977) and Sequenase (US 
Biochemical Corp.). Templates were made from M13 mp10 vectors containing 
inserts generated by sonication of plasmid clones. One cDNA was completely 
sequenced on both strands using oligonucleotides and double strand sequencing of 
plasmid DNA (Sambrook et al., 1989) to fill gaps. Two additional cDNAs were 
analyzed by double strand sequencing to obtain the 3' 402 bp of the transcript. All 
three cDNAs were used to construct a plasmid containing the entire transcript. 
The complete transcript sequence is 2860 bp in length with 452 bp of 5' and 217 
bp of 3' untranslated sequences containing stop codons in all reading frames. The 
predicted protein sequence was analyzed using the FASTDB and BLASTP 
programs (Intelligenetics). The fasciclin IV ORF unambiguously contains 10 of the 
11 peptide sequences determined by microsequencing the fasciclin IV trypsin and 
Lys-C peptides. 



Generation of Polyclonal Antibodies From Bacterial Fusion Proteins 

Bacterial trpE fusion proteins were constructed using pATH (Koerner et al., 
1991) vectors, three restriction fragments encoding extracellular sequences, and 
one fragment (770 bp HindIII/EcoRI, which includes amino acids 476-730) 
encoding both extracellular and intracellular sequences (designated #98-3). Fusion 
proteins were isolated by making an extract of purified inclusion bodies (Spindler 
et al., 1984), and rats were immunized with ~70 µg of protein emulsified in RIBI 
adjuvant (Immunochem Research). Rats were injected at two week intervals and 
serum was collected 7 days following each injection. Sera were tested 
histologically on grasshopper embryos at 45% of development. Construct #98-3 
showed a strong response and exhibited a staining pattern identical to that of MAb 
6F8. Two of the extracellular constructs responded weakly but also showed the 
fasciclin IV staining pattern. All pre-immune sera failed to stain grasshopper 
embryos. 



S2 Cell Transfections, Aggregation Assays, and Western Analysis 

A restriction fragment containing the full length fasciclin IV cDNA was 
cloned into pRmHa-3 (Bunch et al., 1988) and co-transformed into Drosophila S2 
cells (Schneider, 1972) with the plasmid pPC4 (Jokerst et al., 1989), which confers 
α-amanitin resistance. S2 cells were transformed using the Lipofectin Reagent and 
recommended protocol (BRL) with minor modifications. All other S2 cell 
manipulations are essentially as described (Snow et al., 1989), including adhesion 
assays. Fasciclin IV expression in transformed cell lines was induced for adhesion 
assays and histology by adding CuSO4 to 0.7 mM and incubating for at least 48 
hrs. Northern analysis confirmed transcription of fasciclin IV, and 
surface-associated staining of the S2 cells with polyclonal serum #98-3 strongly 
suggests fasciclin IV is being transported to the cell surface. Preparation of 
membranes from S2 cells and from grasshopper embryos, PAGE, and Western blot 
were performed as previously described (Elkins et al., 1990b) except that signal was 
detected using the enhanced chemiluminescence immunodetection system kit 
(Amersham). Amount of protein per lane in each sample loaded: fasciclin IV 
protein, ~5 ng; S2 cell membranes, 40 µg; grasshopper membranes, 80 µg. 
Amounts of protein loaded were verified by Ponceau S staining of the blot prior to 
incubation with the antibody. 

Genbank Accession Number: 
The accession number for the sequence reported in this paper is L00709. 

II. Isolation and characterization of Tribolium (SEQ ID NOs: 63 and 64) and 
Drosophila (SEQ ID NOs: 59 and 60) Semaphorin I, Drosophila Semaphorin II 
(SEQ ID NOs: 61 and 62), Human Semaphorin III (SEQ ID NOs: 53 and 54) and 
Vaccinia Virus Semaphorin IV (SEQ ID NOs: 55 and 56) and Variola Major 
(smallpox) Virus Semaphorin IV (SEQ ID NOs: 65 and 66) 

We used our G-Semaphorin I cDNA in standard low stringency screening 
methods (of both cDNA and genomic libraries) in an attempt to isolate a potential 
Semaphorin I homologue from Drosophila. We were unsuccessful in these 
screens. Since the sequence was novel and shared no similarity to anything else in 
the data base, we then attempted to see if we could identify a Semaphorin I 
homologue in other, more closely related insects. If possible, we would then 
compare these sequences to find the most conserved regions, and then use 
probes (i.e., oligonucleotide primers for PCR) based on these conserved regions to 
find a Drosophila homologue. 

In the process, we used the G-Semaphorin I cDNA in low stringency 
screens to clone Semaphorin I cDNAs from libraries made from embryonic RNA 
of the locust Locusta migratoria and from an embryonic cDNA library from the 
cricket Acheta domestica. We used PCR to clone genomic fragments from genomic 
DNA in the beetle Tribolium, and from the moth Manduca. We then used the 
Tribolium genomic DNA fragment to isolate cDNA clones and ultimately sequenced 
the complete ORF for the Tribolium cDNA. 

In the meantime, we used the partial Tribolium and Manduca sequences in 
combination with the complete grasshopper sequence to identify conserved regions 
that allowed us to design primers for PCR in an attempt to clone a Drosophila 
Semaphorin I homologue. Several pairs of primers generated several different 
bands, which were subcloned and sequenced, and several of the bands gave partial 
sequences of the Drosophila Semaphorin homologue. One of the bands gave a 
partial sequence of what was clearly a different, more divergent gene, which we 
call D-Semaphorin II. 

Based on the sequence of PCR products, we knew we had identified two 
different Drosophila genes, one of which appeared to be the Semaphorin I 
homologue, and the other a second related gene. The complete ORF sequence of 
the D-Semaphorin I homologue revealed an overall structure identical to 
G-Semaphorin I: a signal sequence, an extracellular domain of around 550 amino 
acids containing 16 cysteines, a transmembrane domain of 25 amino acids, and a 
cytoplasmic domain of 117 amino acids. When we had finished the sequence for 
D-Semaphorin II, we were able to begin to run homology searches in the data 
base, which revealed some of its structural features further described herein. The 
Semaphorin II sequence revealed a different structure: a signal sequence of 16 
amino acids, a ~525 amino acid domain containing 16 cysteines, with a single 
immunoglobulin (Ig) domain of 66 amino acids, followed by a short unique region 
of 73 amino acids. There is no evidence for either a transmembrane domain or a 
potential phospholipid linkage in the C-terminus of this protein. Thus, it appears 
that the D-Semaphorin II protein is secreted from the cells that produce it. The 
grasshopper, Tribolium, and Drosophila Semaphorin I cDNA sequences, as well as 
the sequence of the D-Semaphorin II cDNA, are shown herein. In addition, we 
used this same technique to identify Semaphorin I genes in a moth, Manduca sexta, 
a locust, Locusta migratoria, and a cricket, Acheta domestica. 

With this large family of insect Semaphorin genes, we identified a number 
of strongly conserved stretches of amino acids with low codon degeneracy, 
suitable for designing PCR primers to look for human genes. We designed a set of 
oligonucleotide primers, and plated out several human cDNA libraries: a fetal 
brain library (Stratagene), and an adult hippocampus library. We ultimately 
obtained human cDNA PCR bands of the right size that did not autoprime and thus 
were good candidates to be bona fide Semaphorin-like cDNAs from humans. These 
bands were purified, subcloned, and sequenced. 

Whole-mount in situ hybridization experiments showed that D-Semaphorin I 
and II are expressed by different subsets of neurons in the embryonic CNS. 
D-Semaphorin I is expressed by certain cells along the midline as well as by other 
neurons, whereas D-Semaphorin II is not expressed at the midline, but is expressed 
by a different subset of neurons. In addition, D-Semaphorin II is expressed by a 
subset of muscles prior to and during the period of innervation by specific 
motoneurons. On the polytene chromosomes, the D-Semaphorin I gene maps to 
(gene-band-chromosome) 29E1-2 on 2L and that of D-Semaphorin II to 53C9-10 on 2R. 
We have identified loss of function mutations in the D-Semaphorin I gene and a 
pair of P-element transposon insertions in the D-Semaphorin II gene which appear 
to cause severe phenotypes. 

When we lined up the G-Semaphorin I, T-Semaphorin I, D-Semaphorin I, 
and D-Semaphorin II sequences and ran the sequences through a sequence data 
base in search of other sequences with significant similarity, we made a 
curious finding: these Semaphorins share sequence similarity with the A39R open 
reading frame (ORF) from Vaccinia virus and the A43R ORF from Variola Major 
(smallpox) virus. The amino acids shared with the virus ORFs 
were in the same regions where the insect proteins shared their greatest similarity. 
The viral ORF began with a putative signal sequence, continued for several 
hundred amino acids with sequence similarity to the Semaphorin genes, and then 
ended without any membrane linkage signal (suggesting that the protein as made by 
the infected cell would likely be secreted). 

We reasoned that the virus semaphorins were appropriated host proteins 
advantageously exploited by the viruses, which would have host counterparts that 
most likely function in the immune system to inhibit or decrease an immune 
response, just as in the nervous system they appear to function by inhibiting 
growth cone extension. Analogous to situations where viruses are thought to 
encode a secreted form of a host cellular receptor, here the virus may cause the 
infected cell to make a lot of the secreted ligand to mimic an inhibitory signal and 
thus help decrease the immune response. 



III. Isolation and characterization of Murine CNS Semaphorin III Receptor 
using Epitope Tagged Human Semaphorin III (hSIII) 

mRNA was isolated from murine fetal brain tissue and used to construct a 
cDNA library in a mammalian expression vector, pCMX, essentially as in Davis et 
al. (1991) Science 253, 59. 

The transfection and screening procedure is modified from Lin et al. (1992) 
Cell 68, 775. COS cells grown on glass slide flaskettes are transfected with pools 
of the cDNA clones and allowed to bind radioiodinated hSIII truncated at the 
C-terminal end of the semaphorin domain. In parallel, similarly treated COS cells 
are allowed to bind unlabelled human semaphorin III truncated at the C-terminal 
end of the semaphorin domain and there joined to a 10-amino acid extension 
derived from the human c-myc proto-oncogene product. This modified hSIII 
allows the identification of hSIII receptors with the use of the tagged ligand as a 
bridge between the receptor and a murine monoclonal antibody which is specific 
for an epitope in the c-myc tag. Accordingly, after binding unlabelled hSIII the 
cells are exposed to the monoclonal, which may be labeled directly or 
subsequently decorated with a secondary anti-mouse labeled antibody for enhanced 
signal amplification. 

Cells are then fixed and screened using dark-field microscopy essentially as 
in Lin et al. (supra). Positive clones are identified, and sequence analysis of 
murine CNS Semaphorin III receptor cDNA clones by the dideoxy chain termination 
method is used to construct full-length receptor coding sequences. 

IV. Protocol for Protein-Protein H-Sema III - H-Sema III Receptor Drug 
Screening Assay. 

A. Reagents: 

- Neutralite Avidin: 20 µg/ml in PBS. 

- Blocking buffer: 5% BSA, 0.5% Tween 20 in PBS; 1 hr, RT. 

- Assay Buffer: 100 mM KCl, 20 mM HEPES pH 7.6, 0.25 mM EDTA, 1% 
glycerol, 0.5% NP-40, 50 mM BME, 1 mg/ml BSA, protease inhibitor cocktail. 

- 33P H-Sema III 10x stock: 10^-8 - 10^-6 M "cold" truncated (Semaphorin domain) 
H-Sema III supplemented with 50,000-500,000 cpm of labeled and truncated 
H-Sema III (Beckman counter). Store at 4°C during screening. 

- Protease inhibitor cocktail (100X): 1 mg Trypsin Inhibitor (BMB # 109894), 
1 mg Aprotinin (BMB # 236624), 2.5 mg Benzamidine (Sigma # B-6506), 2.5 mg 
Leupeptin (BMB # 1017128), 1 mg APMSF (BMB # 917575), and 0.2 mM NaVO3 
(Sigma # S-6508) in 10 ml of PBS. 

- H-Sema III Receptor: 10^-8 - 10^-[illegible] M of biotinylated H-Sema III 
receptor in PBS. 

B. Preparation of assay plates: 

- Coat with 120 µl of stock N-Avidin per well at least 1 hr at 25°C or 
overnight at 4°C. 

- Wash 2X with 200 µl PBS. 

- Block with 150 µl of blocking buffer. 

- Wash 2X with 200 µl PBS. 

C. Assay: 

- Add 40 µl assay buffer/well. 

- Add 10 µl candidate agent. 

- Add 10 µl 33P-H-Sema III (5,000-50,000 cpm/0.1-10 pmoles/well = 10^-9 - 
10^-7 M final concentration). 

- Mix. 

- Incubate 1 hr at 25°C. 

- Add 40 µl H-Sema III receptor (0.1-10 pmoles/40 µl in assay buffer). 

- Incubate 1 hr at 25°C. 

- Stop the reaction by washing 4X with 200 µl PBS. 

- Add 150 µl scintillation cocktail. 

- Count in Topcount. 

D. Assay controls (located on each plate): 

a. Non-specific binding (no receptor added) 

b. Soluble (non-biotinylated receptor) at 80% inhibition. 
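
The final ligand concentration quoted in the assay steps can be checked from the listed additions. The short Python sketch below is a worked calculation under stated assumptions: the per-well volumes are read as microliters, the four additions are assumed to make up the whole well volume, and the function and variable names are hypothetical. Under those assumptions the well holds 100 µl, so 0.1 to 10 pmoles of labeled ligand corresponds to roughly 10^-9 to 10^-7 M, consistent with the stated range and with a 10x stock added at one tenth of the final volume.

```python
# Volumes added per well in microliters, in the order given in the assay steps.
WELL_ADDITIONS_UL = {
    "assay buffer": 40.0,
    "candidate agent": 10.0,
    "labeled ligand": 10.0,
    "receptor": 40.0,
}


def total_volume_ul(additions=WELL_ADDITIONS_UL) -> float:
    """Total well volume in microliters, assuming the additions are additive."""
    return sum(additions.values())


def molar_concentration(pmol: float, volume_ul: float) -> float:
    """Convert an amount in picomoles and a volume in microliters to mol/L."""
    if volume_ul <= 0:
        raise ValueError("volume must be positive")
    return (pmol * 1e-12) / (volume_ul * 1e-6)


def ligand_range_molar(low_pmol: float = 0.1, high_pmol: float = 10.0):
    """Final ligand concentration range for the per-well amounts in the text."""
    volume = total_volume_ul()
    return (
        molar_concentration(low_pmol, volume),
        molar_concentration(high_pmol, volume),
    )


if __name__ == "__main__":
    low, high = ligand_range_molar()
    print(f"{total_volume_ul():.0f} ul per well: {low:.1e} M to {high:.1e} M")
```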



It is evident from the above results that one can use the methods and 
compositions disclosed herein for making and identifying diagnostic probes and 
therapeutic drugs. It will also be clear to one skilled in the art from a reading of 
this disclosure that advantage can be taken to effect alterations of semaphorin 
responsiveness in a host. 

All publications and patent applications cited in this specification are herein 
incorporated by reference as if each individual publication or patent application 
were specifically and individually indicated to be incorporated by reference. 
Although the foregoing invention has been described in some detail by way of 
illustration and example for purposes of clarity of understanding, it will be readily 
apparent to those of ordinary skill in the art in light of the teachings of this 
invention that certain changes and modifications may be made thereto without 
departing from the spirit or scope of the appended claims. 

SEQUENCE LISTINGS: 

Sequences 53-66 show the nucleotide and deduced amino-acid sequences of 
human semaphorin III, vaccinia virus semaphorin IV, grasshopper semaphorin I, 
Drosophila semaphorin I, Drosophila semaphorin II, Tribolium semaphorin I and 
variola major virus semaphorin IV. 






SEQUENCE LISTING 



(1) GENERAL INFORMATION: 



(i) APPLICANT: Goodman, Corey S. 
Kolodkin, Alex L. 
Matthes, David 
Bentley, David R. 
O'Connor, Timothy 

(ii) TITLE OF INVENTION: The Semaphorin Gene Family 

(iii) NUMBER OF SEQUENCES: 66 



(iv) CORRESPONDENCE ADDRESS: 

(A) ADDRESSEE: FLEHR HOHBACH TEST ALBRITTON & HERBERT 

(B) STREET: 4 Embarcadero Center, Suite 3400 

(C) CITY: San Francisco 

(D) STATE: CA 

(E) COUNTRY: USA 

(F) ZIP: 94111-4187 



(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS 

(D) SOFTWARE: PatentIn Release #1.0, Version #1.25 



(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: Not yet assigned 

(B) FILING DATE: 13-SEP-1994 

(C) CLASSIFICATION: 



(viii) ATTORNEY/AGENT INFORMATION: 

(A) NAME: Osman, Richard A. 

(B) REGISTRATION NUMBER: 36,627 

(C) REFERENCE/DOCKET NUMBER: FP-58750-PC/RAO 

(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: (415) 781-1989 

(B) TELEFAX : (415) 398-3249 

(C) TELEX: 910 277299 FHT UR 



(2) INFORMATION FOR SEQ ID NO: 1: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(ix) FEATURE: 

(A) NAME/KEY: Peptide 

(B) LOCATION: 1..6 

(D) OTHER INFORMATION: /label= SEQ01 

/note= "Xaa denotes D or E at residue #1; Q, K, R, A 
or N at residue #3; and Y, F or V at residue #5" 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1: 

Xaa Cys Xaa Asn Xaa Ile 
1               5 



(2) INFORMATION FOR SEQ ID NO: 2: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(ix) FEATURE: 

(A) NAME/KEY: Peptide 

(B) LOCATION: 1..6 

(D) OTHER INFORMATION: /label= SEQ02 

/note= "Xaa denotes Q, K, R, A or N at residue #2; 
Y, F or V at residue #4; and R, K, Q or T at residue 
#6" 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2: 

Cys Xaa Asn Xaa Ile Xaa 
1               5 



(2) INFORMATION FOR SEQ ID NO: 3: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(ix) FEATURE: 

(A) NAME/KEY: Peptide 

(B) LOCATION: 1..7 

(D) OTHER INFORMATION: /label= SEQ03 

/note= "Xaa denotes N or G at residue #4; A, S or N 
at residue #5; Y, F, H or G at residue #6; and 
K, R, H, N or Q at residue #7" 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3: 

Cys Gly Thr Xaa Xaa Xaa Xaa 
1               5 



(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(ix) FEATURE: 

(A) NAME/KEY: Peptide 

(B) LOCATION: 1..8 

(D) OTHER INFORMATION: /label= SEQ04 

/note= "Xaa denotes N or G at residue #4; and A, S 
or N at residue #5" 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 

Cys Gly Thr Xaa Xaa Xaa Xaa Pro 
1               5 



(2) INFORMATION FOR SEQ ID NO: 5: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(ix) FEATURE: 

(A) NAME/KEY: Peptide 

(B) LOCATION: 1..10 

(D) OTHER INFORMATION: /label= SEQ05 

/note= "Xaa denotes N or G at residue #4; and C or 
D at residue #10" 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5: 

Cys Gly Thr Xaa Xaa Xaa Xaa Pro Xaa Xaa 
1               5                   10 



(2) INFORMATION FOR SEQ ID NO: 6: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(ix) FEATURE: 

(A) NAME/KEY: Peptide 

(B) LOCATION: 1..13 

(D) OTHER INFORMATION: /label= SEQ06 

/note= "Xaa denotes C or D at residue #10; and Y 
or I at residue #13" 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6: 

Cys Gly Thr Xaa Xaa Xaa Xaa Pro Xaa Xaa Xaa Xaa Xaa 

(2) INFORMATION FOR SEQ ID NO: 7: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(ix) FEATURE: 

(A) NAME/KEY: Peptide 

(B) LOCATION: 1..7 

(D) OTHER INFORMATION: /label= SEQ07 

/note= "Xaa denotes R, I, Q or V at residue #1; G or 
A at residue #2; L, V or K at residue #3; C or S at 
residue #4; F or Y at residue #6; and D or N at 
residue #7" 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7: 

Xaa Xaa Xaa Xaa Pro Xaa Xaa 
1               5 



(2) INFORMATION FOR SEQ ID NO: 8: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(ix) FEATURE: 

(A) NAME/KEY: Peptide 

(B) LOCATION: 1..7 

(D) OTHER INFORMATION: /label= SEQ08 

/note= "Xaa denotes C or S at residue #1; F or Y 
at residue #3; D or N at residue #4; D, E, R or K at 
residue #6; and H, L or D at residue #7" 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 8: 

Xaa Pro Xaa Xaa Pro Xaa Xaa 
1               5 



(2) INFORMATION FOR SEQ ID NO: 9: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 9 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(ix) FEATURE: 

(A) NAME/KEY: Peptide 

(B) LOCATION: 1..9 

(D) OTHER INFORMATION: /label= SEQ09 

/note= "Xaa denotes G or A at residue #3; C or S 
at residue #5; and D or N at residue #8" 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9: 

Gly Xaa Xaa Xaa Xaa Pro Tyr Xaa Pro 
1               5 



65 (2) INFORMATION FOR SEQ ID NO: 10: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 7 amino acids 
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(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(ix) FEATURE: 

(A) NAME/KEY: Peptide 

(B) LOCATION: 1. .7 

(D) OTHER INFORMATION: /label= SEQ10 

/note= "Xaa denotes F or Y at residue #2; G or A 
at residue #4; and V,N or A at residue #6" 



j ^ (xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10: 

Leu Xaa Ser Xaa Thr Xaa Ala 
1 5 



20 (2) INFORMATION FOR SEQ ID NO: 11: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 9 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: peptide 

(ix) FEATURE:

(A) NAME/KEY: Peptide

(B) LOCATION: 1..9

(D) OTHER INFORMATION: /label= SEQ11

/note= "Xaa denotes F or Y at residue #2; D or E
at residue #8; and F or Y at residue #9"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 
Leu Xaa Ser Xaa Thr Xaa Ala Xaa Xaa 



(2) INFORMATION FOR SEQ ID NO: 12: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 8 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(ix) FEATURE:

(A) NAME/KEY: Peptide

(B) LOCATION: 1..8

(D) OTHER INFORMATION: /label= SEQ12

/note= "Xaa denotes F or Y at residue #1; G or A 
at residue #3; V,N or A at residue #5; D or E at 
residue #7; and F or Y at residue #8" 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12: 

Xaa Ser Xaa Thr Xaa Ala Xaa Xaa 
1 5 



(2) INFORMATION FOR SEQ ID NO: 13:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 7 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(ii) MOLECULE TYPE: peptide 

(ix) FEATURE: 

(A) NAME/KEY: Peptide

(B) LOCATION: 1..7 



(D) OTHER INFORMATION: /label= SEQ13 

/note= "Xaa denotes N or D at residue #2; and A or 
K at residue #3" 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:

Leu Xaa Xaa Pro Asn Phe Val 
1 5 

(2) INFORMATION FOR SEQ ID NO: 14: 



(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 5 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 14:

Phe Phe Phe Arg Glu
1 5



(2) INFORMATION FOR SEQ ID NO: 15: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 6 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(ix) FEATURE:

(A) NAME/KEY: Peptide
(B) LOCATION: 1..6

(D) OTHER INFORMATION: /label= SEQ15 

/note= "Xaa denotes F or Y at residue #3; and T or 
N at residue #6" 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:



Phe Phe Xaa Arg Glu Xaa 
1 5 



(2) INFORMATION FOR SEQ ID NO: 16: 



(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 6 amino acids 
(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(ix) FEATURE:

(A) NAME/KEY: Peptide 

(B) LOCATION: 1..6 

(D) OTHER INFORMATION: /label= SEQ16 

/note= "Xaa denotes T or N at residue #5" 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:

Phe Phe Arg Glu Xaa Ala 
1 5 



(2) INFORMATION FOR SEQ ID NO: 17:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 6 amino acids
(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(ix) FEATURE:

(A) NAME/KEY: Peptide

(B) LOCATION: 1..6 

(D) OTHER INFORMATION: /label= SEQ17 

/note= "Xaa denotes F or Y at residue #2; and T or
N at residue #5"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17:

Phe Xaa Arg Glu Xaa Ala
1 5



(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(ix) FEATURE:

(A) NAME/KEY: Peptide

(B) LOCATION: 1..6

(D) OTHER INFORMATION: /label= SEQ18

/note= "Xaa denotes F or Y at residue #4"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:

Tyr Phe Phe Xaa Arg Glu
1 5

(2) INFORMATION FOR SEQ ID NO: 19:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide

(ix) FEATURE:

(A) NAME/KEY: Peptide

(B) LOCATION: 1..6

(D) OTHER INFORMATION: /label= SEQ19
/note= "Xaa denotes F or Y at residue #1; and F or
Y at residue #4"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:

Xaa Phe Phe Xaa Arg Glu
1 5



(2) INFORMATION FOR SEQ ID NO: 20: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(ix) FEATURE:
(A) NAME/KEY: Peptide

(B) LOCATION: 1..7

(D) OTHER INFORMATION: /label= SEQ20

/note= "Xaa denotes F or Y at residue #1; F or Y
at residue #2; F or Y at residue #3; and T or N at
residue #6"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 20:

Xaa Xaa Xaa Arg Glu Xaa Ala
1 5



(2) INFORMATION FOR SEQ ID NO: 21: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 7 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(ix) FEATURE:

(A) NAME/KEY: Peptide
(B) LOCATION: 1..7

(D) OTHER INFORMATION: /label= SEQ21

/note= "Xaa denotes I or V at residue #1; F or Y
at residue #2; F or Y at residue #4; and F or Y at
residue #5"



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:21: 

Xaa Xaa Phe Xaa Xaa Arg Glu 
1 5 

(2) INFORMATION FOR SEQ ID NO: 22: 



(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 7 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 






WO 95/07706 PCT/US94/10151 

(ii) MOLECULE TYPE: peptide

(ix) FEATURE:

(A) NAME/KEY: Peptide
(B) LOCATION: 1..7

(D) OTHER INFORMATION: /label= SEQ22 

/note= "Xaa denotes K,F or Y at residue #2; F or Y 
at residue #4; F, Y, I or L at residue #5; F,Y,I or 
L at residue #6; and F or Y at residue #7" 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:22: 

Asp Xaa Val Xaa Xaa Xaa Xaa 
1 5 

(2) INFORMATION FOR SEQ ID NO: 23: 

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 8 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide

(ix) FEATURE:

(A) NAME/KEY: Peptide

(B) LOCATION: 1..8

(D) OTHER INFORMATION: /label= SEQ23

/note= "Xaa denotes V or I at residue #1; F or Y 
at residue #2; F,Y,I or L at residue #3; F,Y,I or 
L at residue #4; R or T at residue #6; and T or N 
at residue #8" 




(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 23: 

Xaa Xaa Xaa Xaa Phe Xaa Xaa Xaa 
1 5 

(2) INFORMATION FOR SEQ ID NO: 24: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 8 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(ix) FEATURE:

(A) NAME/KEY: Peptide

(B) LOCATION: 1..8

(D) OTHER INFORMATION: /label= SEQ24

/note= "Xaa denotes V or I at residue #1; F or Y
at residue #2; F, Y, I or L at residue #3; F, Y, I or
L at residue #4; F or Y at residue #5; R or T at
residue #6; E, D or V at residue #7; and T or N at
residue #8"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 24:

Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa
1 5



(2) INFORMATION FOR SEQ ID NO: 25: 



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 7 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(ix) FEATURE:
(A) NAME/KEY: Peptide

(B) LOCATION: 1..7

(D) OTHER INFORMATION: /label= SEQ25

/note= "Xaa denotes F or Y at residue #2; and C or
S at residue #5"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 25:

Glu Xaa Ile Asn Xaa Gly Lys
1 5 

(2) INFORMATION FOR SEQ ID NO: 26: 



(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 7 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(ix) FEATURE:

(A) NAME/KEY: Peptide

(B) LOCATION: 1..7

(D) OTHER INFORMATION: /label= SEQ26

/note= "Xaa denotes F or Y at residue #1; and A, V
or I at residue #7"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 26:

Xaa Ile Asn Cys Gly Lys Xaa
1 5 



(2) INFORMATION FOR SEQ ID NO: 27:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 7 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(ix) FEATURE:

(A) NAME/KEY: Peptide

(B) LOCATION: 1..7

(D) OTHER INFORMATION: /label= SEQ27

/note= "Xaa denotes V or I at residue #2; A or G
at residue #3; R or Q at residue #4; and V or I at
residue #5"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 27:

Arg Xaa Xaa Xaa Xaa Cys Lys
1 5



(2) INFORMATION FOR SEQ ID NO: 28:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 9 amino acids
(B) TYPE: amino acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(ix) FEATURE: 

(A) NAME/KEY: Peptide

(B) LOCATION: 1..9

(D) OTHER INFORMATION: /label= SEQ28

/note= "Xaa denotes V or I at residue #2; R or Q 
at residue #4; and V or I at residue #5" 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:28: 

Arg Xaa Xaa Xaa Xaa Cys Xaa Xaa Asp
1 5



(2) INFORMATION FOR SEQ ID NO: 29: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide 

(ix) FEATURE: 

(A) NAME/KEY: Peptide 

(B) LOCATION: 1..13 

(D) OTHER INFORMATION: /label= SEQ29 

/note= "Xaa denotes V,A or I at residue #3; and 
V,A or I at residue #8" 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 29: 

Gly Lys Xaa Xaa Xaa Xaa Arg Xaa Xaa Xaa Xaa Cys Lys 
1 5 10

(2) INFORMATION FOR SEQ ID NO: 30: 



(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 7 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide

(ix) FEATURE: 

(A) NAME/KEY: Peptide 

(B) LOCATION: 1..7 

(D) OTHER INFORMATION: /label= SEQ30

/note= "Xaa denotes R, K or N at residue #1; T, A or
S at residue #3; T, A or S at residue #4; F, Y or L
at residue #5; and K or R at residue #7"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 30:

Xaa Trp Xaa Xaa Xaa Leu Xaa 
1 5 



(2) INFORMATION FOR SEQ ID NO: 31:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 8 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(ix) FEATURE:

(A) NAME/KEY: Peptide

(B) LOCATION: 1..8

(D) OTHER INFORMATION: /label= SEQ31 

/note= "Xaa denotes F or Y at residue #1; K or R 
at residue #3; A or S at residue #4; and N or I at 
residue #7" 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 31: 

Xaa Leu Xaa Xaa Arg Leu Xaa Cys 
1 5 



(2) INFORMATION FOR SEQ ID NO: 32:



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 6 amino acids 

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(ix) FEATURE:

(A) NAME/KEY: Peptide

(B) LOCATION: 1..6

(D) OTHER INFORMATION: /label= SEQ32 

/note= "Xaa denotes N or I at residue #1; I or V 
at residue #4; and P or S at residue #5" 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 32: 

Xaa Cys Ser Xaa Xaa Gly 
1 5 






(2) INFORMATION FOR SEQ ID NO: 33: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 9 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 

(ix) FEATURE:

(A) NAME/KEY: Peptide

(B) LOCATION: 1..9

(D) OTHER INFORMATION: /label= SEQ33

/note= "Xaa denotes T, A or S at residue #2; T, A or
S at residue #3; F, Y or L at residue #4; and
A, S, V, I or L at residue #7"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 33: 



Trp Xaa Xaa Xaa Leu Lys Xaa Xaa Leu
1 5



(2) INFORMATION FOR SEQ ID NO: 34:



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: peptide 

(ix) FEATURE:

(A) NAME/KEY: Peptide

(B) LOCATION: 1..11

(D) OTHER INFORMATION: /label= SEQ34

/note= "Xaa denotes T, A or S at residue #2; and
T, A or S at residue #3"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 34:

Trp Xaa Xaa Xaa Leu Lys Xaa Xaa Leu Xaa Cys
1 5 10



(2) INFORMATION FOR SEQ ID NO: 35: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 11 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(ix) FEATURE:

(A) NAME/KEY: Peptide

(B) LOCATION: 1..11

(D) OTHER INFORMATION: /label= SEQ35
/note= "Xaa denotes T or S at residue #3"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 35:

Trp Xaa Xaa Xaa Leu Lys Xaa Xaa Leu Xaa Cys
1 5 10



(2) INFORMATION FOR SEQ ID NO: 36:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 7 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(ix) FEATURE:

(A) NAME/KEY: Peptide

(B) LOCATION: 1..7

(D) OTHER INFORMATION: /label= SEQ36
/note= "Xaa denotes F or Y at residue #1; F or Y
at residue #2; and N or D at residue #3"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 36:

Xaa Xaa Xaa Glu Ile Gln Ser
1 5

(2) INFORMATION FOR SEQ ID NO: 37:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 7 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(ix) FEATURE:

(A) NAME/KEY: Peptide

(B) LOCATION: 1..7

(D) OTHER INFORMATION: /label= SEQ37

/note= "Xaa denotes F or Y at residue #1; F or Y
at residue #3; F or Y at residue #4; F or Y at
residue #5; and N or D at residue #6"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 37:

Xaa Pro Xaa Xaa Xaa Xaa Glu
1 5



(2) INFORMATION FOR SEQ ID NO: 38: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 7 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(ix) FEATURE:
(A) NAME/KEY: Peptide

(B) LOCATION: 1..7 

(D) OTHER INFORMATION: /label= SEQ38 

/note= "Xaa denotes V,I or L at residue #4; and F 
or Y at residue #7" 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 38:

Gly Ser Ala Xaa Cys Xaa Xaa 
1 5 

(2) INFORMATION FOR SEQ ID NO: 39: 



(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 8 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(ix) FEATURE:

(A) NAME/KEY: Peptide

(B) LOCATION: 1..8

(D) OTHER INFORMATION: /label= SEQ39

/note= "Xaa denotes V, I or L at residue #3; and F
or Y at residue #6"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 39:
Ser Ala Xaa Cys Xaa Xaa Xaa Met
1 5

(2) INFORMATION FOR SEQ ID NO: 40: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 7 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(ix) FEATURE:

(A) NAME/KEY: Peptide

(B) LOCATION: 1..7

(D) OTHER INFORMATION: /label= SEQ40

/note= "Xaa denotes N or A at residue #3; and P or
A at residue #6"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 40:

Asn Ser Xaa Trp Leu Xaa Val 
1 5 

(2) INFORMATION FOR SEQ ID NO: 41:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 7 amino acids
(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(ix) FEATURE:

(A) NAME/KEY: Peptide

(B) LOCATION: 1..7

(D) OTHER INFORMATION: /label= SEQ41

/note= "Xaa denotes V, L or I at residue #1; and
E, D, Y, S or F at residue #3"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 41:

Xaa Pro Xaa Pro Arg Pro Gly
1 5



(2) INFORMATION FOR SEQ ID NO: 42: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 9 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(ix) FEATURE:
(A) NAME/KEY: Peptide

(B) LOCATION: 1..9

(D) OTHER INFORMATION: /label= SEQ42

/note= "Xaa denotes V, L or I at residue #1; and R
or A at residue #5"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 42:

Xaa Pro Xaa Pro Xaa Pro Gly Xaa Cys
1 5

(2) INFORMATION FOR SEQ ID NO: 43: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 8 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(ix) FEATURE:

(A) NAME/KEY: Peptide

(B) LOCATION: 1..8

(D) OTHER INFORMATION: /label= SEQ43

/note= "Xaa denotes E, D, Y, S or F at residue #2;
and T, Q or S at residue #7"



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 43: 

Pro Xaa Pro Arg Pro Gly Xaa Cys 
1 5 



(2) INFORMATION FOR SEQ ID NO: 44:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 6 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(ix) FEATURE:

(A) NAME/KEY: Peptide

(B) LOCATION: 1..6

(D) OTHER INFORMATION: /label= SEQ44

/note= "Xaa denotes H, F or Y at residue #3; and A
or G at residue #5"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 44:

Asp Pro Xaa Cys Xaa Trp
1 5



(2) INFORMATION FOR SEQ ID NO: 45: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 6 amino acids
(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(ix) FEATURE:

(A) NAME/KEY: Peptide
(B) LOCATION: 1..6

(D) OTHER INFORMATION: /label= SEQ45

/note= "Xaa denotes H, F or Y at residue #2; and A
or G at residue #4"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 45:
Pro Xaa Cys Xaa Trp Asp
1 5

(2) INFORMATION FOR SEQ ID NO: 46: 

(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 7 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(ix) FEATURE:

(A) NAME/KEY: Peptide

(B) LOCATION: 1..7

(D) OTHER INFORMATION: /label= SEQ46

/note= "Xaa denotes A or G at residue #5"

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 46:

Asp Pro Xaa Cys Xaa Trp Asp
1 5



(2) INFORMATION FOR SEQ ID NO: 47: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 12 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 47: 

Cys Xaa Xaa Xaa Xaa Asp Pro Xaa Cys Xaa Trp Asp 
1 5 10



(2) INFORMATION FOR SEQ ID NO: 48:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 11 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 
(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 48:

Cys Xaa Xaa Xaa Asp Pro Xaa Cys Xaa Trp Asp
1 5 10

(2) INFORMATION FOR SEQ ID NO: 49: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 10 amino acids
(B) TYPE: amino acid

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 49:

Cys Xaa Xaa Asp Pro Xaa Cys Xaa Trp Asp
1 5 10

(2) INFORMATION FOR SEQ ID NO: 50: 




(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 15 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear



(ii) MOLECULE TYPE: peptide 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 50: 

Cys Xaa Xaa Cys Xaa Xaa Xaa Xaa Asp Xaa Xaa Cys Xaa Trp Asp 
1 5 10 15 



(2) INFORMATION FOR SEQ ID NO: 51:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 14 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 51:



Cys Xaa Xaa Cys Xaa Xaa Xaa Asp Xaa Xaa Cys Xaa Trp Asp 
1 5 10 

(2) INFORMATION FOR SEQ ID NO: 52: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 13 amino acids 
(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 52: 

Cys Xaa Xaa Cys Xaa Xaa Asp Xaa Xaa Cys Xaa Trp Asp 
1 5 10 

(2) INFORMATION FOR SEQ ID NO: 53: 



(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 2601 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:

(A) NAME/KEY: CDS
(B) LOCATION: 16..2331

(xi) SEQUENCE DESCRIPTION : SEQ ID NO: 53: 

GGAATTCCCT GCAGC ATG GGC TGG TTA ACT AGG ATT GTC TGT CTT TTC TGG 51
Met Gly Trp Leu Thr Arg Ile Val Cys Leu Phe Trp
1 5 10

Sli Sit T T I A r CTT ^ «f A AGA GCA ^ TAT CAG GGG AAG AAC AAT 99
Gly Val Leu Leu Thr Ala Arg Ala Asn Tyr Gln Asn Gly Lys Asn Asn
15 20 25

Sal pS t CTG ?** P A TCC TAC *** GAA ATG TTG GAA TCC AAC AAT 147
Val Pro Arg Leu Lys Leu Ser Tyr Lys Glu Met Leu Glu Ser Asn Asn
30 35 40

vll Ilf Th! II °? C TTG GCC AAC AGC TCC AGT TAT CAT ACC TTC 195
Val Ile Thr Phe Asn Gly Leu Ala Asn Ser Ser Ser Tyr His Thr Phe
45 50 55 60

CTT TTG GAT GAG GAA CGG AGT AGG CTG TAT GTT GGA GCA AAG GAT CAC 243
Leu Leu Asp Glu Glu Arg Ser Arg Leu Tyr Val Gly Ala Lys Asp His
65 70 75

ATA TTT TCA TTC GAC CTG GTT AAT ATC AAG GAT TTT CAA AAG ATT GTG 291
Ile Phe Ser Phe Asp Leu Val Asn Ile Lys Asp Phe Gln Lys Ile Val
80 85 90

™ S GG o CA f TA TCT TAC ACC AGA AGA GAT GAA TGC AAG TGG GCT GGA AAA 339
Trp Pro Val Ser Tyr Thr Arg Arg Asp Glu Cys Lys Trp Ala Gly Lys
95 100 105

Jin ?Tf T° TG ?** GAA TGT GCT ** T TTC ATC ^ GTA CTT AAG GCA TAT 387
Asp Ile Leu Lys Glu Cys Ala Asn Phe Ile Lys Val Leu Lys Ala Tyr
110 115 120

AAT CAG ACT CAC TTG TAC GCC TGT GGA ACG GGG GCT TTT CAT CCA ATT 435
Asn Gln Thr His Leu Tyr Ala Cys Gly Thr Gly Ala Phe His Pro Ile
125 130 135 140

Ivs Thr £r H~ GAA ^ °? A CAT CAT CCT GAG GAC *** ATT T TT AAG 483
Cys Thr Tyr Ile Glu Ile Gly His His Pro Glu Asp Asn Ile Phe Lys
145 150 155

CTG GAG AAC TCA CAT TTT GAA AAC GGC CGT GGG AAG AGT CCA TAT GAC 531
Leu Glu Asn Ser His Phe Glu Asn Gly Arg Gly Lys Ser Pro Tyr Asp
160 165 170

CCT AAG CTG CTG ACA GCA TCC CTT TTA ATA GAT GGA GAA TTA TAC TCT 579
Pro Lys Leu Leu Thr Ala Ser Leu Leu Ile Asp Gly Glu Leu Tyr Ser
175 180 185

GGA ACT GCA GCT GAT TTT ATG GGG CGA GAC TTT GCT ATC TTC CGA ACT 627
Gly Thr Ala Ala Asp Phe Met Gly Arg Asp Phe Ala Ile Phe Arg Thr
190 195 200

CTT GGG CAC CAC CAC CCA ATC AGG ACA GAG CAG CAT GAT TCC AGG TGG 675
Leu Gly His His His Pro Ile Arg Thr Glu Gln His Asp Ser Arg Trp
205 210 215 220

CTC AAT GAT CCA AAG TTC ATT AGT GCC CAC CTC ATC TCA GAG AGT GAC 723
Leu Asn Asp Pro Lys Phe Ile Ser Ala His Leu Ile Ser Glu Ser Asp
225 230 235

AAT CCT GAA GAT GAC AAA GTA TAC TTT TTC TTC CGT GAA AAT GCA ATA 771
Asn Pro Glu Asp Asp Lys Val Tyr Phe Phe Phe Arg Glu Asn Ala Ile
240 245 250



GAT GGA GAA CAcflfe GGA AAA GCT ACT CAC GCT AGA 9 GGT CAG ATA 819
Asp Gly Glu His Ser Gly Lys Ala Thr His Ala Arg Ile Gly Gln Ile
255 260 265

TGC AAG AAT GAG TTT GGA GGG CAC AGA ACT CTG GTG AAT AAA TGG ACA 867
Cys Lys Asn Asp Phe Gly Gly His Arg Ser Leu Val Asn Lys Trp Thr
270 275 280

ACA TTC CTC AAA GCT CGT CTG ATT TGC TCA GTG CCA GGT CCA AAT GGC 915
Thr Phe Leu Lys Ala Arg Leu Ile Cys Ser Val Pro Gly Pro Asn Gly
285 290 295 300

ATT GAC ACT CAT TTT GAT GAA CTG CAG GAT GTA TTC CTA ATG AAC TTT 963
Ile Asp Thr His Phe Asp Glu Leu Gln Asp Val Phe Leu Met Asn Phe
305 310 315

AAA GAT CCT AAA AAT CCA GTT GTA TAT GGA GTG TTT ACG ACT TCC AGT 1011
Lys Asp Pro Lys Asn Pro Val Val Tyr Gly Val Phe Thr Thr Ser Ser
320 325 330

AAC ATT TTC AAG GGA TCA GCC GTG TGT ATG TAT AGC ATG AGT GAT GTG 1059
Asn Ile Phe Lys Gly Ser Ala Val Cys Met Tyr Ser Met Ser Asp Val
335 340 345

AGA AGG GTG TTC CTT GGT CCA TAT GCC CAC AGG GAT GGA CCC AAC TAT 1107
Arg Arg Val Phe Leu Gly Pro Tyr Ala His Arg Asp Gly Pro Asn Tyr
350 355 360

CAA TGG GTG CCT TAT CAA GGA AGA GTC CCC TAT CCA CGG CCA GGA ACT 1155
Gln Trp Val Pro Tyr Gln Gly Arg Val Pro Tyr Pro Arg Pro Gly Thr
365 370 375 380

TGT CCC AGC AAA ACA TTT GGT GGT TTT GAC TCT ACA AAG GAC CTT CCT 1203
Cys Pro Ser Lys Thr Phe Gly Gly Phe Asp Ser Thr Lys Asp Leu Pro
385 390 395

GAT GAT GTT ATA ACC TTT GCA AGA AGT CAT CCA GCC ATG TAC AAT CCA 1251
Asp Asp Val Ile Thr Phe Ala Arg Ser His Pro Ala Met Tyr Asn Pro
400 405 410

GTG TTT CCT ATG AAC AAT CGC CCA ATA GTG ATC AAA ACG GAT GTA AAT 1299
Val Phe Pro Met Asn Asn Arg Pro Ile Val Ile Lys Thr Asp Val Asn
415 420 425

TAT CAA TTT ACA CAA ATT GTC GTA GAC CGA GTG GAT GCA GAA GAT GGA 1347
Tyr Gln Phe Thr Gln Ile Val Val Asp Arg Val Asp Ala Glu Asp Gly
430 435 440

CAG TAT GAT GTT ATG TTT ATC GGA ACA GAT GTT GGG ACC GTT CTT AAA 1395
Gln Tyr Asp Val Met Phe Ile Gly Thr Asp Val Gly Thr Val Leu Lys
445 450 455 460

GTA GTT TCA ATT CCT AAG GAG ACT TGG TAT GAT TTA GAA GAG GTT CTG 1443
Val Val Ser Ile Pro Lys Glu Thr Trp Tyr Asp Leu Glu Glu Val Leu
465 470 475

CTG GAA GAA ATG ACA GTT TTT CGG GAA CCG ACT GCT ATT TCA GCA ATG 1491
Leu Glu Glu Met Thr Val Phe Arg Glu Pro Thr Ala Ile Ser Ala Met
480 485 490

GAG CTT TCC ACT AAG CAG CAA CAA CTA TAT ATT GGT TCA ACG GCT GGG 1539
Glu Leu Ser Thr Lys Gln Gln Gln Leu Tyr Ile Gly Ser Thr Ala Gly
495 500 505

GTT GCC CAG CTC CCT TTA CAC CGG TGT GAT ATT TAC GGG AAA GCG TGT 1587
Val Ala Gln Leu Pro Leu His Arg Cys Asp Ile Tyr Gly Lys Ala Cys
510 515 520

GCT GAG TGT^Bc CTC GCC CGA GAC' CCT TAG TGT^Bi TGG GAT GGT TCT 1635
Ala Glu Cys Cys Leu Ala Arg Asp Pro Tyr Cys Ala Trp Asp Gly Ser
525 530 535 540

GCA TGT TCT CGC TAT TTT CCC ACT GCA AAG AGA CGC ACA AGA CGA CAA 1683
Ala Cys Ser Arg Tyr Phe Pro Thr Ala Lys Arg Arg Thr Arg Arg Gln
545 550 555

GAT ATA AGA AAT GGA GAC CCA CTG ACT CAC TGT TCA GAC TTA CAC CAT 1731
Asp Ile Arg Asn Gly Asp Pro Leu Thr His Cys Ser Asp Leu His His
560 565 570

GAT AAT CAC CAT GGC CAC AGC CCT GAA GAG AGA ATC ATC TAT GGT GTA 1779
Asp Asn His His Gly His Ser Pro Glu Glu Arg Ile Ile Tyr Gly Val
575 580 585

GAG AAT AGT AGC ACA TTT TTG GAA TGC AGT CCG AAG TCG CAG AGA GCG 1827
Glu Asn Ser Ser Thr Phe Leu Glu Cys Ser Pro Lys Ser Gln Arg Ala
590 595 600

CTG GTC TAT TGG CAA TTC CAG AGG CGA AAT GAA GAG CGA AAA GAA GAG 1875
Leu Val Tyr Trp Gln Phe Gln Arg Arg Asn Glu Glu Arg Lys Glu Glu
605 610 615 620

ATC AGA GTG GAT GAT CAT ATC ATC AGG ACA GAT CAA GGC CTT CTG CTA 1923
Ile Arg Val Asp Asp His Ile Ile Arg Thr Asp Gln Gly Leu Leu Leu
625 630 635

CGT AGT CTA CAA CAG AAG GAT TCA GGC AAT TAC CTC TGC CAT GCG GTG 1971
Arg Ser Leu Gln Gln Lys Asp Ser Gly Asn Tyr Leu Cys His Ala Val
640 645 650

GAA CAT GGG TTC ATA CAA ACT CTT CTT AAG GTA ACC CTG GAA GTC ATT 2019
Glu His Gly Phe Ile Gln Thr Leu Leu Lys Val Thr Leu Glu Val Ile
655 660 665

GAC ACA GAG CAT TTG GAA GAA CTT CTT CAT AAA GAT GAT GAT GGA GAT 2067
Asp Thr Glu His Leu Glu Glu Leu Leu His Lys Asp Asp Asp Gly Asp
670 675 680

GGC TCT AAG ACC AAA GAA ATG TCC AAT AGC ATG ACA CCT AGC CAG AAG 2115
Gly Ser Lys Thr Lys Glu Met Ser Asn Ser Met Thr Pro Ser Gln Lys
685 690 695 700

GTC TGG TAC AGA GAC TTC ATG CAG CTC ATC AAC CAC CCC AAT CTC AAC 2163
Val Trp Tyr Arg Asp Phe Met Gln Leu Ile Asn His Pro Asn Leu Asn
705 710 715

ACG ATG GAT GAG TTC TGT GAA CAA GTT TGG AAA AGG GAC CGA AAA CAA 2211
Thr Met Asp Glu Phe Cys Glu Gln Val Trp Lys Arg Asp Arg Lys Gln
720 725 730

CGT CGG CAA AGG CCA GGA CAT ACC CCA GGG AAC AGT AAC AAA TGG AAG 2259
Arg Arg Gln Arg Pro Gly His Thr Pro Gly Asn Ser Asn Lys Trp Lys
735 740 745

CAC TTA CAA GAA AAT AAG AAA GGT AGA AAC AGG AGG ACC CAC GAA TTT 2307
His Leu Gln Glu Asn Lys Lys Gly Arg Asn Arg Arg Thr His Glu Phe
750 755 760

GAG AGG GCA CCC AGG AGT GTC TGAGCTGCAT TACCTCTAGA AACCTCAAAC 2358
Glu Arg Ala Pro Arg Ser Val
765 770

AAGTAGAAAC TTGCCTAGAC AATAACTGGA AAAACAAATG CAATATACAT GAACTTTTTT 2418
CATGGCATTA TGTGGATGTT TACAATGGTG GGAAATTCAG CTGAGTTCCA CCAATTATAA 2478
ATTAAATCCA TGAgHRtT TCCTAATAGG CTTTTTTTTC CTAAfRcAC CGGGTTAAAA 2538
GTAAGAGACA GCTGAACCCT CGTGGAGCCA TTCATACAGG TCCCTATTTA AGGAACGGAA 2598
TTC 2601
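The listing for SEQ ID NO: 53 pairs each row of codons with its three-letter translation. As a rough cross-check of rows whose amino acid line is hard to read, the sketch below translates a run of codons with the standard genetic code. The helper name `translate` and the compact table construction are illustrative assumptions, and the code assumes the input is already in frame; it is not a procedure described in the application.

```python
# Standard genetic code, bases ordered T, C, A, G at each codon position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate an in-frame DNA string into one-letter amino acid codes.

    Spaces are ignored so a row can be pasted as printed; '*' marks a stop.
    Raises KeyError if a codon contains a character other than T, C, A, G.
    """
    dna = dna.replace(" ", "").upper()
    codons = (dna[i:i + 3] for i in range(0, len(dna) - len(dna) % 3, 3))
    return "".join(CODON_TABLE[c] for c in codons)


# First row of the coding region of SEQ ID NO: 53 (nucleotides 16 to 51).
first_row = translate("ATG GGC TGG TTA ACT AGG ATT GTC TGT CTT TTC TGG")
# Last seven codons plus the codon that follows them (nucleotides 2308 to 2331).
last_row = translate("GAG AGG GCA CCC AGG AGT GTC TGA")
```

Here `first_row` corresponds to Met Gly Trp Leu Thr Arg Ile Val Cys Leu Phe Trp, and `last_row` ends in `*`, consistent with a stop codon closing the annotated coding region.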

(2) INFORMATION FOR SEQ ID NO: 54: 

10 (i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 771 amino acids 

(B) TYPE: amino acid 
(D) TOPOLOGY: linear 

15 (ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 54: 

Met Glv Trp Leu Thr Arg He Val Cys Leu Phe Trp Gly Val Leu Leu 

20 i 5 io 15 

Thr Ala Arq Ala Asn Tyr Gin Asn Gly Lys Asn Asn Val Pro Arg Leu 
20 25 30 

25 Lys Leu Ser Tyr Lys Glu Met Leu Glu Ser Asn Asn Val He Thr Phe 
35 40 45 

Asn Gly Leu Ala Asn Ser Ser Ser Tyr His Thr Phe Leu Leu Asp Glu 
50 55 60 

30 

Glu Arg Ser Arg Leu Tyr Val Gly Ala Lys Asp His He Phe Ser Phe 
65 70 75 80 

Asp Leu Val Asn He Lys Asp Phe Gin Lys He Val Trp Pro Val Ser 
35 85 90 95 

Tyr Thr Arg Arg Asp Glu Cys Lys Trp Ala Gly Lys Asp He Leu Lys 
100 105 HO 

40 Glu Cys Ala Asn Phe He Lys Val Leu Lys Ala Tyr Asn Gin Thr His 

120 125 



115 



Leu Tyr Ala Cys Gly Thr Gly Ala Phe His Pro He Cys Thr Tyr He 
130 135 140 

45 

Glu He Gly His His Pro Glu Asp Asn He Phe Lys Leu Glu Asn Ser 
145 150 155 160 

His Phe Glu Asn Gly Arg Gly Ly b Ser Pro Tyr Asp Pro Lys Leu Leu 
50 165 170 175 

Thr Ala Ser Leu Leu He Asp Gly Glu Leu Tyr Ser Gly Thr Ala Ala 
180 185 190 

55 Asp Phe Met Gly Arg Asp Phe Ala He Phe Arg Thr Leu Gly His His 
. ^ 195 200 205 

His Pro He Arg Thr Glu Gin His Asp Ser Arg Trp Leu Asn Asp Pro 
210 215 220 

60 Lvs Phe He Ser Ala His Leu He Ser Glu Ser Asp Asn Pro Glu Asp 
225 230 235 240 

Asp Lys Val Tyr Phe Phe Phe Arg Glu Asn Ala He Asp Gly Glu His 
65 245 250 255 

Ser Gly Lys Ala Thr His Ala Arg He Gly Gin He Cys Lys Asn Asp 
260 265 270 
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Phe Gly Glylffs Arg Ser Leu Val Asn Lys Trp T hr Phe Leu Lys 

275 280 2S5 

5 ^ 116 CyS SSr Val Pr ° G1 * Pr ° Asn G1 y He Asp Thr His 

D 290 295 300 

Phe Asp Glu Leu Gin Asp Val Phe Leu Met Asn Phe Lys Asp Pro Lys 

10 Asn Pro Val Val Tyr Gly Val Phe Thr Thr Ser Ser Asn lie Phe Lys 

325 330 335 

Gly Ser Ala Val Cys Met Tyr Ser Met Ser Asp Val Arg Arg Val Phe 
15 340 345 350 

Leu Gly Pro Tyr Ala His Arg Asp Gly Pro Asn Tyr Gin Trp Val Pro 
355 360 365 

20 ^ ?™ Gly Arg Val Pr ° Tyr Pro Pro G1 y Thr Cys Pro Ser Lys 

^ 370 375 3ao 

Thr Phe Gly Gly Phe Asp Ser Thr Lys Asp Leu Pro Asp Asp Val He 
JBb 390 395 400 

25 Thr Phe Ala Arg Ser His Pro Ala Met Tyr Asn Pro Val Phe Pro Met 

405 410 415 

Asn Asn Arg Pro He Val He Lys Thr Asp Val Asn Tyr Gin Phe Thr 
30 420 425 430 

Gin He Val Val Asp Arg Val Asp Ala Glu Asp Gly Gin Tyr Asp Val 
4J5 440 445 

^ MSt IL e 116 Gly Thr Asp Val G1 y Thr Val Leu Lys Val Val Ser He 
450 455 460 

Pro Lys Glu Thr Trp Tyr Asp Leu Glu Glu Val Leu Leu Glu Glu Met 
465 470 475 480 

40 Thr Val Phe Arg Glu Pro Thr Ala He Ser Ala Met Glu Leu Ser Thr 

485 4 9 0 495 

Lys Gin Gin Gin Leu Tyr He Gly Ser Thr Ala Gly Val Ala Gin Leu 
45 500 505 510 

Pro Leu His Arg Cys Asp He Tyr Gly Lys Ala Cys Ala Glu Cys Cys 
515 520 525 

50 i\n *** ASP Pr ° Tyr CyS Ala Trp As P G1 y Ser Ala c ys Ser Arg 

u OJU 535 540 

Tyr Phe Pro Thr Ala Lys Arg Arg Thr Arg Arg Gln Asp Ile Arg Asn
545 550 555 560

Gly Asp Pro Leu Thr His Cys Ser Asp Leu His His Asp Asn His His
565 570 575

Gly His Ser Pro Glu Glu Arg Ile Ile Tyr Gly Val Glu Asn Ser Ser
580 585 590

Thr Phe Leu Glu Cys Ser Pro Lys Ser Gln Arg Ala Leu Val Tyr Trp
595 600 605

65 llO ^ c 1U G1U Arg LyS Glu Glu Ile Ar ? Val Asp 

615 520 

Asp His Ile Ile Arg Thr Asp Gln Gly Leu Leu Leu Arg Ser Leu Gln
625 630 635 640

Gln Lys Asp Ser^y Asn Tyr Leu Cys His Ala Val^B His Gly Phe
645 650 655

Ile Gln Thr Leu Leu Lys Val Thr Leu Glu Val Ile Asp Thr Glu His
660 665 670

Leu Glu Glu Leu Leu His Lys Asp Asp Asp Gly Asp Gly Ser Lys Thr
675 680 685

Lys Glu Met Ser Asn Ser Met Thr Pro Ser Gln Lys Val Trp Tyr Arg
690 695 700



Asp Phe Met Gln Leu Ile Asn His Pro Asn Leu Asn Thr Met Asp Glu
705 710 715 720

Phe Cys Glu Gln Val Trp Lys Arg Asp Arg Lys Gln Arg Arg Gln Arg
725 730 735

Pro Gly His Thr Pro Gly Asn Ser Asn Lys Trp Lys His Leu Gln Glu
740 745 750

Asn Lys Lys Gly Arg Asn Arg Arg Thr His Glu Phe Glu Arg Ala Pro
755 760 765

Arg Ser Val
770



(2) INFORMATION FOR SEQ ID NO: 55: 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1332 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:
(A) NAME/KEY: CDS

(B) LOCATION: 7..1329

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:55: 

GGAATA ATG ATG GTA TTA TTA CAT GCT GTA TAC TCT ATA GTC TTT GTA 48
Met Met Val Leu Leu His Ala Val Tyr Ser Ile Val Phe Val
1 5 10

GAT GTT ATA ATC ATA AAA GTA CAG AGG TAT ATC AAC GAT ATT CTA ACT 96
Asp Val Ile Ile Ile Lys Val Gln Arg Tyr Ile Asn Asp Ile Leu Thr
15 20 25 30

CTT GAC ATT TTT TAT TTA TTT AAA ATG ATA CCT TTG TTA TTT ATT TTA 144
Leu Asp Ile Phe Tyr Leu Phe Lys Met Ile Pro Leu Leu Phe Ile Leu
35 40 45

TTC TAT TTT GCT AAC GGT ATC GAA TGG CAT AAG TTT GAA ACG AGT GAA 192
Phe Tyr Phe Ala Asn Gly Ile Glu Trp His Lys Phe Glu Thr Ser Glu
50 55 60

GAA ATA ATT TCT ACT TAC TTA TTA GAC GAC GTA TTA TAC ACG GGT GTT 240
Glu Ile Ile Ser Thr Tyr Leu Leu Asp Asp Val Leu Tyr Thr Gly Val
65 70 75

AAT GGG GCG GTA TAC ACA TTT TCA AAT AAT AAA CTA AAC AAA ACT GGT 288
Asn Gly Ala Val Tyr Thr Phe Ser Asn Asn Lys Leu Asn Lys Thr Gly
80 85 90

TTA ACT AAT^^ AAT TAT ATA ACA ACA TCT ATA^MF GTA GAG GAT GCG 336
Leu Thr Asn Asn Asn Tyr Ile Thr Thr Ser Ile Lys Val Glu Asp Ala
95 100 105 110

GAT AAG GAT ACA TTA GTA TGC GGA ACC AAT AAC GGA AAT CCC AAA TGT 384
Asp Lys Asp Thr Leu Val Cys Gly Thr Asn Asn Gly Asn Pro Lys Cys
115 120 125

TGG AAA ATA GAC GGT TCA GAC GAC CCA AAA CAT AGA GGT AGA GGA TAC 432
Trp Lys Ile Asp Gly Ser Asp Asp Pro Lys His Arg Gly Arg Gly Tyr
130 135 140



GCT CCT TAT CAA AAT AGC AAA GTA ACG ATA ATC AGT CAC AAC GGA TGT 480
Ala Pro Tyr Gln Asn Ser Lys Val Thr Ile Ile Ser His Asn Gly Cys
145 150 155

GTA CTA TCT GAC ATA AAC ATA TCA AAA GAA GGA ATT AAA CGA TGG AGA 528
Val Leu Ser Asp Ile Asn Ile Ser Lys Glu Gly Ile Lys Arg Trp Arg
160 165 170

AGA TTT GAC GGA CCA TGT GGT TAT GAT TTA TAC ACG GCG GAT AAC GTA 576
Arg Phe Asp Gly Pro Cys Gly Tyr Asp Leu Tyr Thr Ala Asp Asn Val
175 180 185 190

ATT CCA AAA GAT GGT TTA CGA GGA GCA TTC GTC GAT AAA GAT GGT ACT 624
Ile Pro Lys Asp Gly Leu Arg Gly Ala Phe Val Asp Lys Asp Gly Thr
195 200 205

TAT GAC AAA GTT TAC ATT CTT TTC ACT GAT ACT ATC GGC TCA AAG AGA 672
Tyr Asp Lys Val Tyr Ile Leu Phe Thr Asp Thr Ile Gly Ser Lys Arg
210 215 220

ATT GTC AAA ATT CCG TAT ATA GCA CAA ATG TGC CTA AAC GAC GAA GGT 720
Ile Val Lys Ile Pro Tyr Ile Ala Gln Met Cys Leu Asn Asp Glu Gly
225 230 235

GGT CCA TCA TCA TTG TCT AGT CAT AGA TGG TCG ACG TTT CTC AAA GTC 768
Gly Pro Ser Ser Leu Ser Ser His Arg Trp Ser Thr Phe Leu Lys Val
240 245 250

GAA TTA GAA TGT GAT ATC GAC GGA AGA AGT TAT AGA CAA ATT ATT CAT 816
Glu Leu Glu Cys Asp Ile Asp Gly Arg Ser Tyr Arg Gln Ile Ile His
255 260 265 270

TCT AGA ACT ATA AAA ACA GAT AAT GAT ACG ATA CTA TAT GTA TTC TTC 864
Ser Arg Thr Ile Lys Thr Asp Asn Asp Thr Ile Leu Tyr Val Phe Phe
275 280 285

GAT AGT CCT TAT TCC AAG TCC GCA TTA TGT ACC TAT TCT ATG AAT ACC 912
Asp Ser Pro Tyr Ser Lys Ser Ala Leu Cys Thr Tyr Ser Met Asn Thr
290 295 300

ATT AAA CAA TCT TTT TCT ACG TCA AAA TTG GAA GGA TAT ACA AAG CAA 960
Ile Lys Gln Ser Phe Ser Thr Ser Lys Leu Glu Gly Tyr Thr Lys Gln
305 310 315

TTG CCG TCG CCA GCC TCT GGT ATA TGT CTA CCA GCT GGA AAA GTT GTT 1008
Leu Pro Ser Pro Ala Ser Gly Ile Cys Leu Pro Ala Gly Lys Val Val
320 325 330

CCA CAT ACC ACG TTT GAA GTC ATA GAA AAA TAT AAT GTA CTA GAT GAT 1056
Pro His Thr Thr Phe Glu Val Ile Glu Lys Tyr Asn Val Leu Asp Asp
335 340 345 350

ATT ATA AAG CCT TTA TCT AAC CAA CCT ATC TTC GAA GGA CCG TCT GGT 1104
Ile Ile Lys Pro Leu Ser Asn Gln Pro Ile Phe Glu Gly Pro Ser Gly
355 360 365

GTT AAA TGG TTC [...] ATA AAG GAG AAG GAA AAT GAA [...] CGG GAA TAT 1152
Val Lys Trp Phe Asp Ile Lys Glu Lys Glu Asn Glu His Arg Glu Tyr
370 375 380

AGA ATA TAC TTC ATA AAA GAA AAT TCT ATA TAT TCG TTC GAT ACA AAA 1200
Arg Ile Tyr Phe Ile Lys Glu Asn Ser Ile Tyr Ser Phe Asp Thr Lys
385 390 395

TCT AAA CAA ACT CGT AGC TCG CAA GTC GAT GCG CGA CTA TTT TCA GTA 1248
Ser Lys Gln Thr Arg Ser Ser Gln Val Asp Ala Arg Leu Phe Ser Val
400 405 410

ATG GTA ACT TCG AAA CCG TTA TTT ATA GCA GAT ATA GGG ATA GGA GTA 1296
Met Val Thr Ser Lys Pro Leu Phe Ile Ala Asp Ile Gly Ile Gly Val
415 420 425 430

GGA ATG CCA CAA ATG AAA AAA ATA CTT AAA ATG TAA 1332
Gly Met Pro Gln Met Lys Lys Ile Leu Lys Met
435 440


(2) INFORMATION FOR SEQ ID NO: 56:



(i) SEQUENCE CHARACTERISTICS: 
(A) LENGTH: 441 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 56: 

Met Met Val Leu Leu His Ala Val Tyr Ser Ile Val Phe Val Asp Val
1 5 10 15

Ile Ile Ile Lys Val Gln Arg Tyr Ile Asn Asp Ile Leu Thr Leu Asp
20 25 30

Ile Phe Tyr Leu Phe Lys Met Ile Pro Leu Leu Phe Ile Leu Phe Tyr
35 40 45

Phe Ala Asn Gly Ile Glu Trp His Lys Phe Glu Thr Ser Glu Glu Ile
50 55 60

Ile Ser Thr Tyr Leu Leu Asp Asp Val Leu Tyr Thr Gly Val Asn Gly
65 70 75 80

Ala Val Tyr Thr Phe Ser Asn Asn Lys Leu Asn Lys Thr Gly Leu Thr
85 90 95

Asn Asn Asn Tyr Ile Thr Thr Ser Ile Lys Val Glu Asp Ala Asp Lys
100 105 110

Asp Thr Leu Val Cys Gly Thr Asn Asn Gly Asn Pro Lys Cys Trp Lys
115 120 125

Ile Asp Gly Ser Asp Asp Pro Lys His Arg Gly Arg Gly Tyr Ala Pro
130 135 140

Tyr Gln Asn Ser Lys Val Thr Ile Ile Ser His Asn Gly Cys Val Leu
145 150 155 160

Ser Asp Ile Asn Ile Ser Lys Glu Gly Ile Lys Arg Trp Arg Arg Phe
165 170 175

Asp Gly Pro Cys Gly Tyr Asp Leu Tyr Thr Ala Asp Asn Val Ile Pro
180 185 190

Lys Asp Gly Leu Arg Gly Ala Phe Val Asp Lys Asp Gly Thr Tyr Asp
195 200 205

Lys Val Tyr Ile Leu Phe Thr Asp Thr Ile Gly Ser Lys Arg Ile Val
210 215 220

Lys Ile Pro Tyr Ile Ala Gln Met Cys Leu Asn Asp Glu Gly Gly Pro
225 230 235 240

Ser Ser Leu Ser Ser His Arg Trp Ser Thr Phe Leu Lys Val Glu Leu
245 250 255

Glu Cys Asp Ile Asp Gly Arg Ser Tyr Arg Gln Ile Ile His Ser Arg
260 265 270

Thr Ile Lys Thr Asp Asn Asp Thr Ile Leu Tyr Val Phe Phe Asp Ser
275 280 285

Pro Tyr Ser Lys Ser Ala Leu Cys Thr Tyr Ser Met Asn Thr Ile Lys
290 295 300

Gln Ser Phe Ser Thr Ser Lys Leu Glu Gly Tyr Thr Lys Gln Leu Pro
305 310 315 320

Ser Pro Ala Ser Gly Ile Cys Leu Pro Ala Gly Lys Val Val Pro His
325 330 335

Thr Thr Phe Glu Val Ile Glu Lys Tyr Asn Val Leu Asp Asp Ile Ile
340 345 350

Lys Pro Leu Ser Asn Gln Pro Ile Phe Glu Gly Pro Ser Gly Val Lys
355 360 365

Trp Phe Asp Ile Lys Glu Lys Glu Asn Glu His Arg Glu Tyr Arg Ile
370 375 380

Tyr Phe Ile Lys Glu Asn Ser Ile Tyr Ser Phe Asp Thr Lys Ser Lys
385 390 395 400

Gln Thr Arg Ser Ser Gln Val Asp Ala Arg Leu Phe Ser Val Met Val
405 410 415

Thr Ser Lys Pro Leu Phe Ile Ala Asp Ile Gly Ile Gly Val Gly Met
420 425 430

Pro Gln Met Lys Lys Ile Leu Lys Met
435 440



(2) INFORMATION FOR SEQ ID NO: 57:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2854 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 451..2640

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:57:


ATTCCACCTC CCGCTGACCG CCTACGCCGC GACGATCTTT CCTCTCGCCA GGCGAAAACT 60 
ACGACGTGTC AACAACATTT TTGTTTTTTC TGCTTCCGTG TTTTCATGTT CCGTGAAACC 120
GCTTCTCGCA TTAO^EtCT TCCGTTTCCC AGTGTTTGTT TTcfScGTTT CTTTCATCGT 180 

GGATGTTTTG TTTTGGTGTA GCGAGTGACG AGCTTATGTC ATTAAACGTA CATCCAATCT 240 

GTCGGTATAT TGGTGTGTGA TATTTTACTA TTATATATTT AGCCATCACT TGAAAGCCGT 300 

GAAAAATTTT TGAAAGTGGA GAGGAAAAAG AAAAGGCGCA GAAGGCTTTT TAAGCTTCAT 360

GGATATGTGC TCTACGCTTC AACTACTGTC GCAGAATCAT CTTCCGGGAA AGGAAATTTC 420 

GCCTGAAATG GTGCCGCGGC CGCACTGAAC ATG CGG GCG GCG CTG GTG GCC GTC 474 

Met Arg Ala Ala Leu Val Ala Val 
1 5 



GCG GCG CTG CTT TGG GTG GCG CTG CAC GCC GCC GCA TGG GTC AAC GAC 522
Ala Ala Leu Leu Trp Val Ala Leu His Ala Ala Ala Trp Val Asn Asp
10 15 20

GTC AGC CCC AAG ATG TAC GTC CAG TTC GGT GAG GAA CGG GTG CAA CGC 570
Val Ser Pro Lys Met Tyr Val Gln Phe Gly Glu Glu Arg Val Gln Arg
25 30 35 40

TTC CTG GGC AAT GAA TCG CAC AAA GAC CAC TTC AAG CTG CTG GAG AAG 618
Phe Leu Gly Asn Glu Ser His Lys Asp His Phe Lys Leu Leu Glu Lys
45 50 55

GAC CAC AAC TCG CTC CTC GTA GGA GCT AGG AAC ATC GTC TAC AAT ATC 666
Asp His Asn Ser Leu Leu Val Gly Ala Arg Asn Ile Val Tyr Asn Ile
60 65 70

AGC CTT CGA GAC CTC ACA GAA TTC ACC GAG CAG AGG ATC GAG TGG CAC 714
Ser Leu Arg Asp Leu Thr Glu Phe Thr Glu Gln Arg Ile Glu Trp His
75 80 85

TCG TCA GGT GCC CAT CGC GAG CTC TGC TAC CTC AAG GGG AAG TCA GAG 762
Ser Ser Gly Ala His Arg Glu Leu Cys Tyr Leu Lys Gly Lys Ser Glu
90 95 100

GAC GAC TGC CAG AAC TAC ATC CGA GTC CTG GCG AAA ATT GAC GAT GAC 810
Asp Asp Cys Gln Asn Tyr Ile Arg Val Leu Ala Lys Ile Asp Asp Asp
105 110 115 120

CGC GTA CTC ATC TGC GGT ACG AAC GCC TAT AAG CCA CTA TGT CGG CAC 858
Arg Val Leu Ile Cys Gly Thr Asn Ala Tyr Lys Pro Leu Cys Arg His
125 130 135

TAC GCC CTC AAG GAT GGA GAT TAT GTT GTA GAG AAA GAA TAT GAG GGA 906
Tyr Ala Leu Lys Asp Gly Asp Tyr Val Val Glu Lys Glu Tyr Glu Gly
140 145 150

AGA GGA TTG TGC CCA TTT GAC CCT GAC CAC AAC AGC ACT GCA ATA TAC 954
Arg Gly Leu Cys Pro Phe Asp Pro Asp His Asn Ser Thr Ala Ile Tyr
155 160 165

AGT GAG GGA CAA TTG TAC TCA GCA ACA GTG GCA GAC TTC TCT GGA ACT 1002
Ser Glu Gly Gln Leu Tyr Ser Ala Thr Val Ala Asp Phe Ser Gly Thr
170 175 180

GAC CCT CTC ATA TAC CGC GGC CCT CTA AGA ACA GAG AGA TCT GAC CTC 1050
Asp Pro Leu Ile Tyr Arg Gly Pro Leu Arg Thr Glu Arg Ser Asp Leu
185 190 195 200

AAA CAA TTA AAT GCT CCT AAC TTT GTC AAC ACA ATG GAG TAC AAT GAT 1098
Lys Gln Leu Asn Ala Pro Asn Phe Val Asn Thr Met Glu Tyr Asn Asp
205 210 215

TTT ATA TTC TTC TTC TTC CGA GAG ACT GCT GTT GAG TAC ATC AAC TGC 1146
Phe Ile Phe Phe Phe Phe Arg Glu Thr Ala Val Glu Tyr Ile Asn Cys
220 225 230

fCS a°I rT C I AT ^ CA AGA GTT GCC AGA GTC TGT AAA CAT GAC AAG 1194
Gly Lys Ala Ile Tyr Ser Arg Val Ala Arg Val Cys Lys His Asp Lys
235 240 245

GGC G ? C 2 CT CAT CAG GGT GGT GAC AGA TGG ACT TCT TTT TTG AAA TCA 1242
Gly Gly Pro His Gln Gly Gly Asp Arg Trp Thr Ser Phe Leu Lys Ser
250 255 260

£™ ?l G l GT TCC GTC CCT GGA GAT TAT CCA TTT TAC TTC AAT GAA 1290
Arg Leu Asn Cys Ser Val Pro Gly Asp Tyr Pro Phe Tyr Phe Asn Glu
265 270 275 280

ATT CAG TCA ACA AGT GAC ATC ATT GAA GGA AAT TAT GGT GGT CAA GTG 1338
Ile Gln Ser Thr Ser Asp Ile Ile Glu Gly Asn Tyr Gly Gly Gln Val
285 290 295

GAG AAA CTC ATC TAC GGT GTC TTC ACG ACA CCA GTG AAC TCT ATT GGT 1386
Glu Lys Leu Ile Tyr Gly Val Phe Thr Thr Pro Val Asn Ser Ile Gly
300 305 310

G?v 1*1 A?= v T ? £ GT GCC 1X0 AGT ATG AAG TCA ATA CTT GAG TCA TTT 1434
Gly Ser Ala Val Cys Ala Phe Ser Met Lys Ser Ile Leu Glu Ser Phe
315 320 325

GAT GGT CCA TTT AAA GAG CAG GAA ACG ATG AAC TCA AAC TGG TTG GCA 1482
Asp Gly Pro Phe Lys Glu Gln Glu Thr Met Asn Ser Asn Trp Leu Ala
330 335 340

v=? o CA c GC CTT GTG CCA GAA CCA AGG CCT GGA CAA TGT GTG AAT 1530
Val Pro Ser Leu Lys Val Pro Glu Pro Arg Pro Gly Gln Cys Val Asn
345 350 355 360

GAC AGT CGT ACA CTT CCT GAT GTG TCT GTC AAT TTT GTA AAG TCA CAT 1578
Asp Ser Arg Thr Leu Pro Asp Val Ser Val Asn Phe Val Lys Ser His
365 370 375

ACA CTG ATG GAT GAG GCC GTG CCA GCA TTT TTT ACT CGG CCA ATT CTC 1626
Thr Leu Met Asp Glu Ala Val Pro Ala Phe Phe Thr Arg Pro Ile Leu
380 385 390

ill a™ tT C AGC t TTA CAG TAC AGA TTT ACA AAA ATA GCT GTT GAT CAA 1674
Ile Arg Ile Ser Leu Gln Tyr Arg Phe Thr Lys Ile Ala Val Asp Gln
395 400 405

CAA GTC CGA ACA CCA GAT GGG AAA GCG TAT GAT GTC CTG TTT ATA GGA 1722
Gln Val Arg Thr Pro Asp Gly Lys Ala Tyr Asp Val Leu Phe Ile Gly
410 415 420

ACT GAT GAT GGC AAA GTG ATA AAA GCT TTG AAC TCT GCC TCC TTT GAT 1770
Thr Asp Asp Gly Lys Val Ile Lys Ala Leu Asn Ser Ala Ser Phe Asp
425 430 435 440

TCA TCT GAT ACT GTA GAT AGT GTT GTA ATA GAA GAA CTG CAA GTG TTG 1818
Ser Ser Asp Thr Val Asp Ser Val Val Ile Glu Glu Leu Gln Val Leu
445 450 455

CCA CCT GGA GTA CCT GTT AAG AAC CTG TAT GTG GTG CGA ATG GAT GGG 1866
Pro Pro Gly Val Pro Val Lys Asn Leu Tyr Val Val Arg Met Asp Gly
460 465 470

GAT GAT AGC AAG CTG GTG GTT GTG TCT GAT GAT GAG ATT CTG GCA ATT 1914
Asp Asp Ser Lys Leu Val Val Val Ser Asp Asp Glu Ile Leu Ala Ile
475 480 485

AAG CTT CAT CGT TGT GGC TCA GAT AAA ATA ACA AAT TGT CGA GAA TGT 1962
Lys Leu His Arg Cys Gly Ser Asp Lys Ile Thr Asn Cys Arg Glu Cys
490 495 500

GTG TCC TTG CAA GAT CCT TAC TGT GCA TGG GAC AAT GTA GAA TTA AAA 2010
Val Ser Leu Gln Asp Pro Tyr Cys Ala Trp Asp Asn Val Glu Leu Lys
505 510 515 520

TGT ACA GCT GTA GGT TCA CCA GAC TGG ACT GCT GGA AAA AGA CGC TTT 2058
Cys Thr Ala Val Gly Ser Pro Asp Trp Ser Ala Gly Lys Arg Arg Phe
525 530 535



ATT CAG AAC ATT TCA CTC GGT GAA CAT AAA GCT TGT GGT GGA CGT CCA 2106
Ile Gln Asn Ile Ser Leu Gly Glu His Lys Ala Cys Gly Gly Arg Pro
540 545 550

CAA ACA GAA ATC GTT GCT TCT CCT GTA CCA ACT CAG CCG ACG ACA AAA 2154
Gln Thr Glu Ile Val Ala Ser Pro Val Pro Thr Gln Pro Thr Thr Lys
555 560 565

TCT AGT GGC GAT CCC GTT CAT TCA ATC CAC CAG GCT GAA TTT GAA CCT 2202
Ser Ser Gly Asp Pro Val His Ser Ile His Gln Ala Glu Phe Glu Pro
570 575 580

GAA ATT GAC AAC GAG ATT GTT ATT GGA GTA GAT GAC AGC AAC GTC ATT 2250
Glu Ile Asp Asn Glu Ile Val Ile Gly Val Asp Asp Ser Asn Val Ile
585 590 595 600

CCT AAT ACC CTG GCT GAA ATA AAT CAT GCA GGT TCA AAG CTG CCT TCC 2298
Pro Asn Thr Leu Ala Glu Ile Asn His Ala Gly Ser Lys Leu Pro Ser
605 610 615

TCC CAG GAA AAG TTG CCT ATT TAT ACA GCG GAG ACT CTG ACT ATT GCT 2346
Ser Gln Glu Lys Leu Pro Ile Tyr Thr Ala Glu Thr Leu Thr Ile Ala
620 625 630

ATA GTT ACA TCA TGC CTT GGA GCT CTA GTT GTT GGC TTC ATC TCT GGA 2394
Ile Val Thr Ser Cys Leu Gly Ala Leu Val Val Gly Phe Ile Ser Gly
635 640 645

TTT CTT TTT TCT CGG CGA TGC AGG GGA GAG GAT TAC ACA GAC ATG CCT 2442
Phe Leu Phe Ser Arg Arg Cys Arg Gly Glu Asp Tyr Thr Asp Met Pro
650 655 660

TTT CCA GAT CAA CGC CAT CAG CTA AAT AGG CTC ACT GAG GCT GGT CTG 2490
Phe Pro Asp Gln Arg His Gln Leu Asn Arg Leu Thr Glu Ala Gly Leu
665 670 675 680

AAT GCA GAC TCA CCC TAT CTT CCA CCC TGT GCC AAT AAC AAG GCA GCC 2538
Asn Ala Asp Ser Pro Tyr Leu Pro Pro Cys Ala Asn Asn Lys Ala Ala
685 690 695

ATA AAT CTT GTG CTC AAT GTC CCA CCA AAG AAT GCA AAT GGA AAA AAT 2586
Ile Asn Leu Val Leu Asn Val Pro Pro Lys Asn Ala Asn Gly Lys Asn
700 705 710

GCC AAC TCT TCA GCT GAA AAC AAA CCA ATA CAG AAA GTA AAA AAG ACA 2634
Ala Asn Ser Ser Ala Glu Asn Lys Pro Ile Gln Lys Val Lys Lys Thr
715 720 725

TAC ATT TAGCAGAAAT CTTTGGTATC TGTTTTGGTG CAGACCCATG CCACTAGAGT 2690
Tyr Ile
730

AACCAAGACT CTATTGAGAA ATGTCCTCAA GAAAGTTAAA AAGATGTAGA CTTCTGTAAT 2750
CGAGAGCACC ACTTTCCATA GTAATACAGA ACAATGTGAA ATAAATACTA CAGAAGAAGT 2810
CTTTGTTACA CHKAAAAGTG TATAGTGATC TGTGATCAGT 2854



(2) INFORMATION FOR SEQ ID NO: 58:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 730 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:58: 

Met Arg Ala Ala Leu Val Ala Val Ala Ala Leu Leu Trp Val Ala Leu
1 5 10 15

His Ala Ala Ala Trp Val Asn Asp Val Ser Pro Lys Met Tyr Val Gln
20 25 30

Phe Gly Glu Glu Arg Val Gln Arg Phe Leu Gly Asn Glu Ser His Lys
35 40 45

Asp His Phe Lys Leu Leu Glu Lys Asp His Asn Ser Leu Leu Val Gly
50 55 60

Ala Arg Asn Ile Val Tyr Asn Ile Ser Leu Arg Asp Leu Thr Glu Phe
65 70 75 80

Thr Glu Gln Arg Ile Glu Trp His Ser Ser Gly Ala His Arg Glu Leu
85 90 95

Cys Tyr Leu Lys Gly Lys Ser Glu Asp Asp Cys Gln Asn Tyr Ile Arg
100 105 110

Val Leu Ala Lys Ile Asp Asp Asp Arg Val Leu Ile Cys Gly Thr Asn
115 120 125

Ala Tyr Lys Pro Leu Cys Arg His Tyr Ala Leu Lys Asp Gly Asp Tyr
130 135 140

Val Val Glu Lys Glu Tyr Glu Gly Arg Gly Leu Cys Pro Phe Asp Pro
145 150 155 160

Asp His Asn Ser Thr Ala Ile Tyr Ser Glu Gly Gln Leu Tyr Ser Ala
165 170 175

Thr Val Ala Asp Phe Ser Gly Thr Asp Pro Leu Ile Tyr Arg Gly Pro
180 185 190

Leu Arg Thr Glu Arg Ser Asp Leu Lys Gln Leu Asn Ala Pro Asn Phe
195 200 205

Val Asn Thr Met Glu Tyr Asn Asp Phe Ile Phe Phe Phe Phe Arg Glu
210 215 220

Thr Ala Val Glu Tyr Ile Asn Cys Gly Lys Ala Ile Tyr Ser Arg Val
225 230 235 240

Ala Arg Val Cys Lys His Asp Lys Gly Gly Pro His Gln Gly Gly Asp
245 250 255

Arg Trp Thr Ser Phe Leu Lys Ser Arg Leu Asn Cys Ser Val Pro Gly
260 265 270

Asp Tyr Pro Phe Tyr Phe Asn Glu Ile Gln Ser Thr Ser Asp Ile Ile
275 280 285

Glu Gly Asn Tyr Gly Gly Gln Val Glu Lys Leu Ile Tyr Gly Val Phe
290 295 300

Thr Thr Pro Val Asn Ser Ile Gly Gly Ser Ala Val Cys Ala Phe Ser
305 310 315 320

Met Lys Ser Ile Leu Glu Ser Phe Asp Gly Pro Phe Lys Glu Gln Glu
325 330 335

Thr Met Asn Ser Asn Trp Leu Ala Val Pro Ser Leu Lys Val Pro Glu
340 345 350

Pro Arg Pro Gly Gln Cys Val Asn Asp Ser Arg Thr Leu Pro Asp Val
355 360 365

Ser Val Asn Phe Val Lys Ser His Thr Leu Met Asp Glu Ala Val Pro
370 375 380

Ala Phe Phe Thr Arg Pro Ile Leu Ile Arg Ile Ser Leu Gln Tyr Arg
385 390 395 400

Phe Thr Lys Ile Ala Val Asp Gln Gln Val Arg Thr Pro Asp Gly Lys
405 410 415

Ala Tyr Asp Val Leu Phe Ile Gly Thr Asp Asp Gly Lys Val Ile Lys
420 425 430

Ala Leu Asn Ser Ala Ser Phe Asp Ser Ser Asp Thr Val Asp Ser Val
435 440 445

Val Ile Glu Glu Leu Gln Val Leu Pro Pro Gly Val Pro Val Lys Asn
450 455 460

Leu Tyr Val Val Arg Met Asp Gly Asp Asp Ser Lys Leu Val Val Val
465 470 475 480

Ser Asp Asp Glu Ile Leu Ala Ile Lys Leu His Arg Cys Gly Ser Asp
485 490 495

Lys Ile Thr Asn Cys Arg Glu Cys Val Ser Leu Gln Asp Pro Tyr Cys
500 505 510

Ala Trp Asp Asn Val Glu Leu Lys Cys Thr Ala Val Gly Ser Pro Asp
515 520 525

Trp Ser Ala Gly Lys Arg Arg Phe Ile Gln Asn Ile Ser Leu Gly Glu
530 535 540

His Lys Ala Cys Gly Gly Arg Pro Gln Thr Glu Ile Val Ala Ser Pro
545 550 555 560

Val Pro Thr Gln Pro Thr Thr Lys Ser Ser Gly Asp Pro Val His Ser
565 570 575

Ile His Gln Ala Glu Phe Glu Pro Glu Ile Asp Asn Glu Ile Val Ile
580 585 590

Gly Val Asp Asp Ser Asn Val Ile Pro Asn Thr Leu Ala Glu Ile Asn
595 600 605

His Ala Gly Ser Lys Leu Pro Ser Ser Gln Glu Lys Leu Pro Ile Tyr
610 615 620

Thr Ala Glu Thr Leu Thr Ile Ala Ile Val Thr Ser Cys Leu Gly Ala
625 630 635 640

Leu Val Val Gly Phe Ile Ser Gly Phe Leu Phe Ser Arg Arg Cys Arg
645 650 655

Gly Glu Asp Tyr Thr Asp Met Pro Phe Pro Asp Gln Arg His Gln Leu
660 665 670

Asn Arg Leu Thr Glu Ala Gly Leu Asn Ala Asp Ser Pro Tyr Leu Pro
675 680 685

Pro Cys Ala Asn Asn Lys Ala Ala Ile Asn Leu Val Leu Asn Val Pro
690 695 700

Pro Lys Asn Ala Asn Gly Lys Asn Ala Asn Ser Ser Ala Glu Asn Lys
705 710 715 720

Pro Ile Gln Lys Val Lys Lys Thr Tyr Ile
725 730



(2) INFORMATION FOR SEQ ID NO: 59:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 3560 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..1953

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 59:

GAG GAT GAT TGT CAG AAT TAC ATC CGC ATC ATG GTG GTG CCA TCG CCG 48
Glu Asp Asp Cys Gln Asn Tyr Ile Arg Ile Met Val Val Pro Ser Pro
1 5 10 15

GGT CGC CTT TTC GTT TGT GGC ACC AAC TCG TTC CGG CCC ATG TGC AAC 96
Gly Arg Leu Phe Val Cys Gly Thr Asn Ser Phe Arg Pro Met Cys Asn
20 25 30




?S Sr Ho Til t A ° c GC ^ TAC ACG CTG GAG GCC ACG AAC 144
Thr Tyr Ile Ile Ser Asp Ser Asn Tyr Thr Leu Glu Ala Thr Lys Asn
35 40 45

gS Gin A CG n° C S CC ~ AC GAT CCA CGT CAC TCC ACC *CT GTG 192
Gly Gln Ala Val Cys Pro Tyr Asp Pro Arg His Asn Ser Thr Ser Val
50 55 60

CTG GCC GAC AAC GAA CTG TAT TCC GGT ACC GTG GCG GAT TTC ACT GGC 240
Leu Ala Asp Asn Glu Leu Tyr Ser Gly Thr Val Ala Asp Phe Ser Gly
65 70 75 80

AGC GAT CCG ATT ATC TAC CGG GAG CCC CTG CAG ACC GAG CAG TAC GAT 288
Ser Asp Pro Ile Ile Tyr Arg Glu Pro Leu Gln Thr Glu Gln Tyr Asp
85 90 95

tor r°o A t CTC AAC GCA CCG AAC TTT GTG AGC TCA TTT ACG CAG GGC 336
Ser Leu Ser Leu Asn Ala Pro Asn Phe Val Ser Ser Phe Thr Gln Gly
100 105 110

GAC TTT GTC TAT TTC TTC TTT CGG GAA ACC GCC GTT GAG TTT ATC AAC 384
Asp Phe Val Tyr Phe Phe Phe Arg Glu Thr Ala Val Glu Phe Ile Asn
115 120 125

TGT GGC AAG GCG ATT TAT TCG CGC GTT GCC CGC GTC TGC AAA TGG GAC 432
Cys Gly Lys Ala Ile Tyr Ser Arg Val Ala Arg Val Cys Lys Trp Asp
130 135 140

AAA GGT GGC CCG CAT CGA TTC CGC AAC CGC TGG ACA TCC TTC CTC AAG 480
Lys Gly Gly Pro His Arg Phe Arg Asn Arg Trp Thr Ser Phe Leu Lys
145 150 155 160

TCC CGC CTC AAC TGC TCC ATT CCC GGC GAT TAT CCT TTC TAC TTT AAT 528
Ser Arg Leu Asn Cys Ser Ile Pro Gly Asp Tyr Pro Phe Tyr Phe Asn
165 170 175

GAA ATC CAA TCT GCC AGC AAT CTG GTG GAG GGA CAG TAT GGC TCG ATG 576
Glu Ile Gln Ser Ala Ser Asn Leu Val Glu Gly Gln Tyr Gly Ser Met
180 185 190

AGC TCG AAA CTG ATC TAC GGA GTC TTC AAC ACG CCG AGC AAC TCA ATT 624
Ser Ser Lys Leu Ile Tyr Gly Val Phe Asn Thr Pro Ser Asn Ser Ile
195 200 205

CCC GGC TCA GCG GTT TGT GCC TTT GCC CTC CAG GAC ATT GCC GAT ACG 672
Pro Gly Ser Ala Val Cys Ala Phe Ala Leu Gln Asp Ile Ala Asp Thr
210 215 220

TTT GAG GGT CAG TTC AAG GAG CAG ACT GGC ATC AAC TCC AAC TGG CTG 720
Phe Glu Gly Gln Phe Lys Glu Gln Thr Gly Ile Asn Ser Asn Trp Leu
225 230 235 240

CCA GTG AAC AAC GCC AAG GTA CCC GAT CCT CGA CCC GGT TCC TGT CAC 768
Pro Val Asn Asn Ala Lys Val Pro Asp Pro Arg Pro Gly Ser Cys His
245 250 255

AAC GAT TCG AGA GCG CTT CCG GAT CCC ACA CTG AAC TTC ATC AAA ACA 816
Asn Asp Ser Arg Ala Leu Pro Asp Pro Thr Leu Asn Phe Ile Lys Thr
260 265 270

CAT TCG CTA ATG GAC GAG AAT GTG CCG GCA TTT TTC ACT CAA CCG ATT 864
His Ser Leu Met Asp Glu Asn Val Pro Ala Phe Phe Ser Gln Pro Ile
275 280 285

TTG GTC CGG ACG AGC ACA ATA TAC CGC TTC ACT CAA ATC GCC GTA GAT 912
Leu Val Arg Thr Ser Thr Ile Tyr Arg Phe Thr Gln Ile Ala Val Asp
290 295 300

GCG CAG ATT AAA ACT CCT GGC GGC AAG ACA TAT GAT GTT ATC TTT GTG 960
Ala Gln Ile Lys Thr Pro Gly Gly Lys Thr Tyr Asp Val Ile Phe Val
305 310 315 320

GGC ACA GAT CAT GGA AAG ATT ATT AAG TCA GTG AAT GCT GAA TCT GCC 1008
Gly Thr Asp His Gly Lys Ile Ile Lys Ser Val Asn Ala Glu Ser Ala
325 330 335

GAT TCA GCG GAT AAA GTC ACC TCC GTA GTC ATC GAG GAG ATC GAT GTC 1056
Asp Ser Ala Asp Lys Val Thr Ser Val Val Ile Glu Glu Ile Asp Val
340 345 350

CTG ACC AAG AGT GAA CCC ATA CGC AAT CTG GAG ATA GTC AGA ACC ATG 1104
Leu Thr Lys Ser Glu Pro Ile Arg Asn Leu Glu Ile Val Arg Thr Met
355 360 365

CAG TAC GAT CAA CCC AAA GAT GGC AGC TAC GAC GAT GGT AAA TTA ATC 1152
Gln Tyr Asp Gln Pro Lys Asp Gly Ser Tyr Asp Asp Gly Lys Leu Ile
370 375 380

ATT GTG ACG GAC AGT CAG GTG GTA GCC ATA CAA TTG CAT CGT TGT CAC 1200
Ile Val Thr Asp Ser Gln Val Val Ala Ile Gln Leu His Arg Cys His
385 390 395 400

AAT GAC AAA ATC ACC AGC TGC AGC GAG TGC GTC GCA TTG CAG GAT CCG 1248
Asn Asp Lys Ile Thr Ser Cys Ser Glu Cys Val Ala Leu Gln Asp Pro
405 410 415

TAC TGC GCC TGG GAC AAA ATC GCT GGC AAG TGC CGT TCC CAC GGC GCT 1296
Tyr Cys Ala Trp Asp Lys Ile Ala Gly Lys Cys Arg Ser His Gly Ala
420 425 430

f° A S GG CTA GAG GAG AAC TAT TTC TAC CAG AAT GTG GCC ACT GGC 1344
Pro Arg Trp Leu Glu Glu Asn Tyr Phe Tyr Gln Asn Val Ala Thr Gly
435 440 445

GlS SI I ? G A?° o GC S CC TCA GGC *** ATC ** T TCA AAG GAT GCC AAC 1392
Gln His Ala Ala Cys Pro Ser Gly Lys Ile Asn Ser Lys Asp Ala Asn
450 455 460

if, SS SS SS £S S| SS s SS 2S S 2S 25 S S£ S 

**°=» 470 475 ^ 



480 



a™ » GC o AG AGC AAG GAT CAG GAA ATA ATC GAC AAT ATT GAT AAG AAC 1488
Arg Arg Gln Ser Lys Asp Gln Glu Ile Ile Asp Asn Ile Asp Lys Asn
485 490 495



25 SS SS £ S £S S SS 5S {g SS SS SS 2S SS ffi 1536 

1500 505 510 

GCC GTT CTG GCC GGT TCG ATC TTT TCG CTG CTG GTC GGC TTC TTT ACA 1584 

30 ^ 5?s 61y Ser 116 '5? Ser Leu Le « Val Gly J£ J£ ?£r 84 

520 525 

ss ?| ss gs ss ss || s: ss ss ss ss ss 2; ^ s 1632 

35 aju 535 540 



40 



FJ £ S 2J £ SS 55 SS SS SS SS S SS K SS 

550 555 560 



1680 



1728 



ss ss ss s ser iss s: 2s s ss ss ss sss ss s ss 

565 =70 575 

« sss SS SS SS SS ss s? ss iss s ss s SS s ss 



590 



1824 



1872 



1920 



CCT CCG CCG CCC AAT AAG ATG CAC TCG CCG AAG AAC ACG CTG PPT 

Pro Pro Pro Pro Asn Lys Met His Ser Pro Lys Asn Thr Leu Arg Lys
595 600 605

^ SSS SJ SS ss ss ss ss SS III SSS SS K ss ss ss ss 
55 ss ss ss SS ?S5 SJ ig JS SSS SS III S™ J5 ss ss ss 

635 640 

S £ S Glu SS ? ys ?al J£ £ S 1970 
645 6S0 

CGCGGCGATG GCTTTTCCAC CACCCGCAGC GTCAAGAAGG TTTACCTTTG AGACGGGAGT 2030
GGGGCGGCTG AAACCAGTCA GGGACTAATT ACCCAAAATA TGGCTGTAAA CAACACAAAC 2090
ACACGTAACA GAAGTCTTGG TCGCGCAAGA AGACAGCCGC CCCGTCATGG CATTGTAACT 2150
CAACACCGCT CGaJJ^CC CCAGCAGCAG CAGCAGCAGT CGC^RgCC GCACTCCAGT 2210
TCGGGCTCCT CGCCCGTAAT GTCCAACAGC AGCAGCAGTC CGGCTCCGCC CTCCAGCAGT 2270 
CCCAGTCCGC AGGAGAGCCC CAAGAACTGC AGCTACATCT ACCGTGATTG ATTGATATGC 2330
AACACCAAAT CGATGCCACT CATCCAGGCC CAGTCCACGC ACGCCCAGCC ACACTCACAC 2390 
CCGCACCCGC ACCCGCTTCC GCCACCCGGT CCGACCACGC CCCCAGCACA GCCACGCGCC 2450 
AGAAGTCCAA TGATCGGCAG GACATATGCC AAGTCCATGC CCGTGACACC AGTTCAACCG 2510 
CAATCGCCGC TGGCTGAGAC GCCCTCCTAT GAGCTCTACG AACGCCACTC GGATGCGGCC 2570 
ACCTTCCACT TTGGGGATGA GGACGATGAC GATGATGATG AGCACGACCA GGAGGACACC 2630
TCATCGCTGG CCATGATCAC ACCGCCGCCG CCCTACGACA CTCCGCATCT GATTGCATCG 2690 
CCACCGCTGC CGCCGCCTCG TAGATTTCGC TTTGGCAACA GGGAGCTGTT CAGCATGAGT 2750 
CCAGCCGGAG GTGGAACCAC GCCCACCGCC TCGGCAGGCC AACGCGGCAG CAGCGCCATC 2810 
ACGCCCACAA AGTTGAGTGC GGCGGCAGCG GCCATGTTTG CCGCACCCCA AATGGCCACC 2870 
CAACTCAACC GGAAGTGGGC TCATTTGCAA AGGAAGCGGC GCAGGCGCAA CAGCAGCTCC 2930
GGCGATTCTA AGGAGCTCGA CAAACTGGTC CTGCAATCGG TCGACTGGGA TGAGAATGAG 2990 
ATGTACTAGA ACGCAAACCA ACAATGAGAT AGCAGAAACA CTTTGATTCG GAATTTATAC 3050 
ACCTTTGCAT ATTTTGAATA TGACTTCAAT TTTAAAATGC GTAATTATGT TCTTATTTTT 3110 
TAAAGAACGC TTTAGAGAAG TTTTCTGCTA CCTTAAATAG TACACACAAC TCATATCTAA 3170 
CGTGGCGCTG CGATATAGGA ATAACCACTC CCCCTTCCCT TAAACTTAAA GTAGCAATCG 3230
AAAAGATCAT TCATTAGCGA CAGAAACTGG ATGGGGATTT ACTTACACAC AAAAAGCCAG 3290 
AGAAGTTATA CACGAAGTTT ATAGTTATAT AGCCTTTATA CATACTCCCC GATCTGCTAA 3350 
GTATACACAA GCAAGCATAA CATAACATAC GTATATATGA CTCTATATAT ACCAATAGAT 3410 
TTCATAGACG ATTCACATGG ATCGGCTACG CTAAATTAGA GCTGCAAAAT GATATTGTTA 3470 
ATTACGATTA GAGAAAAAAA AAAAGGAATT CGATATCAAG CKTATCGATA CCNTCGACCT 3530
CGNNNNNGGG GCCCGGTACC CAATTCGCCC 3560 

(2) INFORMATION FOR SEQ ID NO: 60:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 650 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 60: 

Glu Asp Asp Cys Gln Asn Tyr Ile Arg Ile Met Val Val Pro Ser Pro
1 5 10 15

Gly Arg Leu Phe Val Cys Gly Thr Asn Ser Phe Arg Pro Met Cys Asn
20 25 30

Thr Tyr Ile Ile Ser Asp Ser Asn Tyr Thr Leu Glu Ala Thr Lys Asn
35 40 45

WO 95/07706 PCT/US94/10151



Gly Gln Ala Val Cys Pro Tyr Asp Pro Arg His Asn Ser Thr Ser Val
50 55 60

Leu Ala Asp Asn Glu Leu Tyr Ser Gly Thr Val Ala Asp Phe Ser Gly
65 70 75 80

Ser Asp Pro Ile Ile Tyr Arg Glu Pro Leu Gln Thr Glu Gln Tyr Asp
85 90 95

Ser Leu Ser Leu Asn Ala Pro Asn Phe Val Ser Ser Phe Thr Gln Gly
100 105 110

Asp Phe Val Tyr Phe Phe Phe Arg Glu Thr Ala Val Glu Phe Ile Asn
115 120 125

Cys Gly Lys Ala Ile Tyr Ser Arg Val Ala Arg Val Cys Lys Trp Asp
130 135 140

Lys Gly Gly Pro His Arg Phe Arg Asn Arg Trp Thr Ser Phe Leu Lys
145 150 155 160

Ser Arg Leu Asn Cys Ser Ile Pro Gly Asp Tyr Pro Phe Tyr Phe Asn
165 170 175

Glu Ile Gln Ser Ala Ser Asn Leu Val Glu Gly Gln Tyr Gly Ser Met
180 185 190

Ser Ser Lys Leu Ile Tyr Gly Val Phe Asn Thr Pro Ser Asn Ser Ile
195 200 205

Pro Gly Ser Ala Val Cys Ala Phe Ala Leu Gln Asp Ile Ala Asp Thr
210 215 220

Phe Glu Gly Gln Phe Lys Glu Gln Thr Gly Ile Asn Ser Asn Trp Leu
225 230 235 240

Pro Val Asn Asn Ala Lys Val Pro Asp Pro Arg Pro Gly Ser Cys His
245 250 255

Asn Asp Ser Arg Ala Leu Pro Asp Pro Thr Leu Asn Phe Ile Lys Thr
260 265 270

His Ser Leu Met Asp Glu Asn Val Pro Ala Phe Phe Ser Gln Pro Ile
275 280 285

Leu Val Arg Thr Ser Thr Ile Tyr Arg Phe Thr Gln Ile Ala Val Asp
290 295 300

Ala Gln Ile Lys Thr Pro Gly Gly Lys Thr Tyr Asp Val Ile Phe Val
305 310 315 320

Gly Thr Asp His Gly Lys Ile Ile Lys Ser Val Asn Ala Glu Ser Ala
325 330 335

Asp Ser Ala Asp Lys Val Thr Ser Val Val Ile Glu Glu Ile Asp Val
340 345 350

Leu Thr Lys Ser Glu Pro Ile Arg Asn Leu Glu Ile Val Arg Thr Met
355 360 365

Gln Tyr Asp Gln Pro Lys Asp Gly Ser Tyr Asp Asp Gly Lys Leu Ile
370 375 380

Ile Val Thr Asp Ser Gln Val Val Ala Ile Gln Leu His Arg Cys His
385 390 395 400

Asn Asp Lys Ile Thr Ser Cys Ser Glu Cys Val Ala Leu Gln Asp Pro
405 410 415

Tyr Cys Ala Trp Asp Lys Ile Ala Gly Lys Cys Arg Ser His Gly Ala
420 425 430

Pro Arg Trp Leu Glu Glu Asn Tyr Phe Tyr Gln Asn Val Ala Thr Gly
435 440 445

Gln His Ala Ala Cys Pro Ser Gly Lys Ile Asn Ser Lys Asp Ala Asn
450 455 460

Ala Gly Glu Gln Lys Gly Phe Arg Asn Asp Met Asp Leu Leu Asp Ser
465 470 475 480

Arg Arg Gln Ser Lys Asp Gln Glu Ile Ile Asp Asn Ile Asp Lys Asn
485 490 495

Phe Glu Asp Ile Ile Asn Ala Gln Tyr Thr Val Glu Thr Leu Val Met
500 505 510

Ala Val Leu Ala Gly Ser Ile Phe Ser Leu Leu Val Gly Phe Phe Thr
515 520 525

Gly Tyr Phe Cys Gly Arg Arg Cys His Lys Asp Glu Asp Asp Asn Leu
530 535 540

Pro Tyr Pro Asp Thr Glu Tyr Glu Tyr Phe Glu Gln Arg Gln Asn Val
545 550 555 560

Asn Ser Phe Pro Ser Ser Cys Arg Ile Gln Gln Glu Pro Lys Leu Leu
565 570 575

Pro Gln Val Glu Glu Val Thr Tyr Ala Asp Ala Val Leu Leu Pro Gln
580 585 590

Pro Pro Pro Pro Asn Lys Met His Ser Pro Lys Asn Thr Leu Arg Lys
595 600 605

Pro Pro Met His Gln Met His Gln Gly Pro Asn Ser Glu Thr Leu Phe
610 615 620

Gln Phe His Val Thr Ala Thr Thr Pro Ser Ser Arg Ile Val Val Ala
625 630 635 640

Thr Thr Ser Glu His Cys Val Pro Thr Arg
645 650



(2) INFORMATION FOR SEQ ID NO: 61: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2670 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA 

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 268..2439

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:61: 
GAAAATCGAA CWCCGAATTG AATGAACWGC AAAACGCCAA TTAGATAGTT GCAAGCCTAA 60

TGCATTTCAG AKATTTNMMC GATGCGAAAC AAGTTCCGCC ACGAAAGTGA ACAGTGGTAA 120

AATGCCCAAG AATCTCGAGC GGAAACACCA AACACAAAAG AACAAGCAAC CGCCTCTCAC 180



TCGCTCTTGC 1BI TTAATCC AATTGAGGTT GGTGGGGTCG ^PlCGCCCC CCGGTCGACC 240 

ACCCCTCTCG CTCGCACCGC CCTCGCA ATG TCT CTT CTA CAG CTA TCG CCG CTC 294 

Met Ser Leu Leu Gln Leu Ser Pro Leu

CTC GCA CTC CTG CTA CTC CTC TGC AGT AGT GTG AGC GAG ACG GCT GCG 342
Leu Ala Leu Leu Leu Leu Leu Cys Ser Ser Val Ser Glu Thr Ala Ala
10 15 20 25

GAC TAC GAG AAC ACC TGG AAC TTC TAC TAC GAG CGT CCC TGT TGC ACT 390
Asp Tyr Glu Asn Thr Trp Asn Phe Tyr Tyr Glu Arg Pro Cys Cys Thr
30 35 40

GGA AAC GAT CAG GGG AAC AAC AAT TAC GGA AAA CAC GGC GCA GAT CAT 438
Gly Asn Asp Gln Gly Asn Asn Asn Tyr Gly Lys His Gly Ala Asp His
45 50 55



GTG CGG GAG TTC AAC TGC GGC AAG CTG TAC TAT CGT ACA TTC CAT ATG 486
Val Arg Glu Phe Asn Cys Gly Lys Leu Tyr Tyr Arg Thr Phe His Met
60 65 70

AAC GAA GAT CGA GAT ACG CTC TAT GTG GGA GCC ATG GAT CGC GTA TTC 534
Asn Glu Asp Arg Asp Thr Leu Tyr Val Gly Ala Met Asp Arg Val Phe
75 80 85

CGT GTG AAC CTG CAG AAT ATC TCC TCA TCC AAT TGT AAT CGG GAT GCG 582
Arg Val Asn Leu Gln Asn Ile Ser Ser Ser Asn Cys Asn Arg Asp Ala
90 95 100 105

[illegible] CGG GAT GAT GTG GTT AGC TGC GTC TCC AAA 630
Ile Asn Leu Glu Pro Thr Arg Asp Asp Val Val Ser Cys Val Ser Lys
110 115 120

GGC AAA AGT CAG ATC TTC GAC TGC AAG AAC CAT GTG CGT GTC ATC CAG 678
Gly Lys Ser Gln Ile Phe Asp Cys Lys Asn His Val Arg Val Ile Gln
125 130 135

TCA ATG GAC CAG GGG GAT AGG CTC TAT GTA TGC GGC ACC AAC GCC CAC 726
Ser Met Asp Gln Gly Asp Arg Leu Tyr Val Cys Gly Thr Asn Ala His
140 145 150

AAT CCC AAG GAT TAT GTT ATC TAT GCG AAT CTA ACC CAC CTG CCG CGC 774
Asn Pro Lys Asp Tyr Val Ile Tyr Ala Asn Leu Thr His Leu Pro Arg
155 160 165

[illegible] GGC GTG GGT CTG GGC ATT GCC AAG TGC CCC TAC 822
Ser Glu Tyr Val Ile Gly Val Gly Leu Gly Ile Ala Lys Cys Pro Tyr
170 175 180 185

GAT CCC CTC GAC AAC TCA ACT GCG ATT TAT GTG GAG AAT GGC AAT CCG 870
Asp Pro Leu Asp Asn Ser Thr Ala Ile Tyr Val Glu Asn Gly Asn Pro
190 195 200

GGT GGT CTG CCC GGT TTG TAC TCC GGC ACC AAT GCG GAG TTC ACC AAG 918
Gly Gly Leu Pro Gly Leu Tyr Ser Gly Thr Asn Ala Glu Phe Thr Lys
205 210 215

[illegible] CGC ACT GAT CTG TAT ACT TCG GCT AAA 966
Ala Asp Thr Val Ile Phe Arg Thr Asp Leu Tyr Asn Thr Ser Ala Lys
220 225 230

CGT TTG GAA TAT AAA TTC AAG AGG ACT CTG AAA TAC GAC TCC AAG TGG 1014
Arg Leu Glu Tyr Lys Phe Lys Arg Thr Leu Lys Tyr Asp Ser Lys Trp
235 240 245

TTG GAC AAA CCA AAC TTT GTC GGC TCC TTT GAT ATT GGG GAG TAC GTG 1062
Leu Asp Lys Pro Asn Phe Val Gly Ser Phe Asp Ile Gly Glu Tyr Val
250 255 260

TAT TTC TTT TTC CGT GAA ACC GCC GTG GAA TAC ATC AAC TGC GGC AAG 1110
Tyr Phe Phe Phe Arg Glu Thr Ala Val Glu Tyr Ile Asn Cys Gly Lys
270 275 280

[illegible] 1158
Ala Val Tyr Ser Arg Ile Ala Arg Val Cys Lys Lys Asp Val Gly Gly
285 290 295

AAG AAT CTG CTG GCC CAC AAC TGG GCC ACC TAC CTG AAG GCC AGA CTC 1206
Lys Asn Leu Leu Ala His Asn Trp Ala Thr Tyr Leu Lys Ala Arg Leu
300 305 310

AAC TGC AGC ATC TCC GGC GAA TTT CCG TTC TAT TTC AAC GAG ATC CAA 1254
Asn Cys Ser Ile Ser Gly Glu Phe Pro Phe Tyr Phe Asn Glu Ile Gln
315 320 325

[illegible] 1302
Ser Val Tyr Gln Leu Pro Ser Asp Lys Ser Arg Phe Phe Ala Thr Phe
330 335 340 345

[illegible] AGC ACT AAT GGC CTG ATT GGA TCT GCC GTA TGC AGT TTC CAC 1350
Thr Thr Ser Thr Asn Gly Leu Ile Gly Ser Ala Val Cys Ser Phe His
350 355 360

ATT AAC GAG ATT CAG GCT GCC TTC AAT GGC AAA TTC AAG GAG CAA TCT 1398
Ile Asn Glu Ile Gln Ala Ala Phe Asn Gly Lys Phe Lys Glu Gln Ser
365 370 375

TCA TCG AAT TCC GCA TGG CTG CCG GTG CTT AAC TCC CGG GTG CCG GAA 1446
Ser Ser Asn Ser Ala Trp Leu Pro Val Leu Asn Ser Arg Val Pro Glu
380 385 390

[illegible] GGT ACA TGT GTC AAC GAT ACA TCA AAC CTG CCC GAT ACC 1494
Pro Arg Pro Gly Thr Cys Val Asn Asp Thr Ser Asn Leu Pro Asp Thr
395 400 405

GTA CTG AAT TTC ATC AGA TCC CAT CCA CTT ATG GAC AAA GCC GTA AAT 1542
Val Leu Asn Phe Ile Arg Ser His Pro Leu Met Asp Lys Ala Val Asn
410 415 420 425

CAC GAG CAC AAC AAT CCA GTC TAT TAT AAA AGG GAT TTG GTC TTC ACC 1590
His Glu His Asn Asn Pro Val Tyr Tyr Lys Arg Asp Leu Val Phe Thr
430 435 440

AAG CTC GTC GTT GAC AAA ATT CGC ATT GAC ATC CTC AAC CAG GAA TAC 1638
Lys Leu Val Val Asp Lys Ile Arg Ile Asp Ile Leu Asn Gln Glu Tyr
445 450 455

ATT GTG TAC TAT GTG GGC ACC AAT CTG GGT CGC ATT TAC AAA ATC GTG 1686
Ile Val Tyr Tyr Val Gly Thr Asn Leu Gly Arg Ile Tyr Lys Ile Val
460 465 470

CAG TAC TAC CGT AAC GGA GAG TCG CTG TCC AAG CTT CTG GAT ATC TTC 1734
Gln Tyr Tyr Arg Asn Gly Glu Ser Leu Ser Lys Leu Leu Asp Ile Phe
475 480 485

[illegible] 1782
Glu Val Ala Pro Asn Glu Ala Ile Gln Val Met Glu Ile Ser Gln Thr
490 495 500 505

[illegible] TAC ATT GGC ACC GAT CAT CGC ATC AAG CAA ATC GAC 1830
Arg Lys Ser Leu Tyr Ile Gly Thr Asp His Arg Ile Lys Gln Ile Asp
510 515 520

CTG GCC ATG TGC AAT CGC CGT TAC GAC AAC TGC TTC CGC TGC GTC CGT 1878
Leu Ala Met Cys Asn Arg Arg Tyr Asp Asn Cys Phe Arg Cys Val Arg
525 530 535

GAT CCC TAC TGC GGC TGG GAT AAG GAG GCC AAT ACG TGC CGA CCG TAC 1926
Asp Pro Tyr Cys Gly Trp Asp Lys Glu Ala Asn Thr Cys Arg Pro Tyr
540 545 550

GAG CTG GAT TTA CTG CAG GAT GTG GCC AAT GAA ACG AGT GAC ATT TGC 1974
Glu Leu Asp Leu Leu Gln Asp Val Ala Asn Glu Thr Ser Asp Ile Cys
555 560 565

GAT TCG AGT GTG CTG AAA AAG AAG ATT GTG GTG ACC TAT GGC CAG AGT 2022
Asp Ser Ser Val Leu Lys Lys Lys Ile Val Val Thr Tyr Gly Gln Ser
570 575 580 585

GTA CAT CTG GGC TGT TTC GTC AAA ATA CCC GAA GTG CTG AAG AAT GAG 2070
Val His Leu Gly Cys Phe Val Lys Ile Pro Glu Val Leu Lys Asn Glu
590 595 600

CAA GTG ACC TGG TAT CAT CAC TCC AAG GAC AAG GGA CGC TAC GAG ATT 2118
Gln Val Thr Trp Tyr His His Ser Lys Asp Lys Gly Arg Tyr Glu Ile
605 610 615

CGT TAC TCG CCG ACC AAA TAC ATT GAG ACC ACC GAA CGT GGC CTG GTT 2166
Arg Tyr Ser Pro Thr Lys Tyr Ile Glu Thr Thr Glu Arg Gly Leu Val
620 625 630

GTG GTT TCC GTG AAC GAA GCC GAT GGT GGT CGG TAC GAT TGC CAT TTG 2214
Val Val Ser Val Asn Glu Ala Asp Gly Gly Arg Tyr Asp Cys His Leu
635 640 645

GGC GGC TCG CTT TTG TGC AGC TAC AAC ATT ACA GTG GAT GCC CAC AGA 2262
Gly Gly Ser Leu Leu Cys Ser Tyr Asn Ile Thr Val Asp Ala His Arg
650 655 660 665

TGC ACT CCG CCG AAC AAG AGT AAT GAC TAT CAG AAA ATC TAC TCG GAC 2310
Cys Thr Pro Pro Asn Lys Ser Asn Asp Tyr Gln Lys Ile Tyr Ser Asp
670 675 680



TGG TGC CAC GAG TTC GAG AAA TAC AAA ACA GCA ATG AAG TCC TGG GAA 2358
Trp Cys His Glu Phe Glu Lys Tyr Lys Thr Ala Met Lys Ser Trp Glu
685 690 695

AAG AAG CAA GGC CAA TGC TCG ACA CGG CAG AAC TTC AGC TGC AAT CAG 2406
Lys Lys Gln Gly Gln Cys Ser Thr Arg Gln Asn Phe Ser Cys Asn Gln
700 705 710

CAT CCG AAT GAG ATT TTC CGT AAG CCC AAT GTC TGATATCACG AAGAGAGTAT 2459
His Pro Asn Glu Ile Phe Arg Lys Pro Asn Val
715 720

CGCCCTCAAA ATGCCGTCAT CGTCGTCCAA TCAATTTTAG TTAATCGAAA GCGAAGAGGA 2519
TAATAACAGT GCGGAATAGA AAGCCCAGGA CGAGAAGAAC TCATTATAAT CATTATTATC 2579
AGCGACATCA TCATAGACAT ACTTTCTTCA GCAATGAACA GAAAACTCTT CCTAAAGGAT 2639
TATGCATTTA CCGAAGCATT TACAATGCAT C 2670
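The CDS coordinates given in the feature headers of this listing can be checked against the stated protein lengths with simple arithmetic. The sketch below is illustrative only: the dictionary layout and the function name `cds_summary` are assumptions made for the example, while the coordinates and lengths are copied from the headers for SEQ ID NOs: 61 to 66. It treats locations as 1-based and inclusive.

```python
# Coordinates and lengths copied from the (i) and (ix) headers of this listing.
# The data layout and the function name are hypothetical, chosen for illustration.
LISTING = {
    "SEQ ID NO:61": {"cds": (268, 2439), "protein_len": 724},
    "SEQ ID NO:63": {"cds": (355, 2493), "protein_len": 712},
    "SEQ ID NO:65": {"cds": (1, 369), "protein_len": 122},
}


def cds_summary(start, end, protein_len):
    """Return span, codon count, frame check, and codons beyond the protein."""
    span = end - start + 1  # feature locations are 1-based and inclusive
    codons, remainder = divmod(span, 3)
    return {
        "span": span,
        "codons": codons,
        "in_frame": remainder == 0,
        "extra_codons": codons - protein_len,
    }


for name, rec in LISTING.items():
    start, end = rec["cds"]
    print(name, cds_summary(start, end, rec["protein_len"]))
```

Under these assumptions, the range for SEQ ID NO: 61 covers exactly 724 codons, while the ranges for SEQ ID NOs: 63 and 65 each cover one codon more than the corresponding protein, which would be consistent with those ranges including the terminal stop codon.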






(2) INFORMATION FOR SEQ ID NO: 62:



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 724 amino acids 
(B) TYPE: amino acid

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 62:

Met Ser Leu Leu Gln Leu Ser Pro Leu Leu Ala Leu Leu Leu Leu Leu
1 5 10 15

Cys Ser Ser Val Ser Glu Thr Ala Ala Asp Tyr Glu Asn Thr Trp Asn
20 25 30

Phe Tyr Tyr Glu Arg Pro Cys Cys Thr Gly Asn Asp Gln Gly Asn Asn
35 40 45

Asn Tyr Gly Lys His Gly Ala Asp His Val Arg Glu Phe Asn Cys Gly
50 55 60

Lys Leu Tyr Tyr Arg Thr Phe His Met Asn Glu Asp Arg Asp Thr Leu
65 70 75 80

Tyr Val Gly Ala Met Asp Arg Val Phe Arg Val Asn Leu Gln Asn Ile
85 90 95

Ser Ser Ser Asn Cys Asn Arg Asp Ala Ile Asn Leu Glu Pro Thr Arg
100 105 110

Asp Asp Val Val Ser Cys Val Ser Lys Gly Lys Ser Gln Ile Phe Asp
115 120 125

Cys Lys Asn His Val Arg Val Ile Gln Ser Met Asp Gln Gly Asp Arg
130 135 140

Leu Tyr Val Cys Gly Thr Asn Ala His Asn Pro Lys Asp Tyr Val Ile
145 150 155 160

Tyr Ala Asn Leu Thr His Leu Pro Arg Ser Glu Tyr Val Ile Gly Val
165 170 175

Gly Leu Gly Ile Ala Lys Cys Pro Tyr Asp Pro Leu Asp Asn Ser Thr
180 185 190

Ala Ile Tyr Val Glu Asn Gly Asn Pro Gly Gly Leu Pro Gly Leu Tyr
195 200 205

Ser Gly Thr Asn Ala Glu Phe Thr Lys Ala Asp Thr Val Ile Phe Arg
210 215 220

Thr Asp Leu Tyr Asn Thr Ser Ala Lys Arg Leu Glu Tyr Lys Phe Lys
225 230 235 240

Arg Thr Leu Lys Tyr Asp Ser Lys Trp Leu Asp Lys Pro Asn Phe Val
245 250 255

Gly Ser Phe Asp Ile Gly Glu Tyr Val Tyr Phe Phe Phe Arg Glu Thr
260 265 270

Ala Val Glu Tyr Ile Asn Cys Gly Lys Ala Val Tyr Ser Arg Ile Ala
275 280 285

Arg Val Cys Lys Lys Asp Val Gly Gly Lys Asn Leu Leu Ala His Asn
290 295 300

Trp Ala Thr Tyr Leu Lys Ala Arg Leu Asn Cys Ser Ile Ser Gly Glu
305 310 315 320

Phe Pro Phe Tyr Phe Asn Glu Ile Gln Ser Val Tyr Gln Leu Pro Ser
325 330 335

Asp Lys Ser Arg Phe Phe Ala Thr Phe Thr Thr Ser Thr Asn Gly Leu
340 345 350

Ile Gly Ser Ala Val Cys Ser Phe His Ile Asn Glu Ile Gln Ala Ala
355 360 365

Phe Asn Gly Lys Phe Lys Glu Gln Ser Ser Ser Asn Ser Ala Trp Leu
370 375 380

Pro Val Leu Asn Ser Arg Val Pro Glu Pro Arg Pro Gly Thr Cys Val
385 390 395 400

Asn Asp Thr Ser Asn Leu Pro Asp Thr Val Leu Asn Phe Ile Arg Ser
405 410 415

His Pro Leu Met Asp Lys Ala Val Asn His Glu His Asn Asn Pro Val
420 425 430

Tyr Tyr Lys Arg Asp Leu Val Phe Thr Lys Leu Val Val Asp Lys Ile
435 440 445

Arg Ile Asp Ile Leu Asn Gln Glu Tyr Ile Val Tyr Tyr Val Gly Thr
450 455 460

Asn Leu Gly Arg Ile Tyr Lys Ile Val Gln Tyr Tyr Arg Asn Gly Glu
465 470 475 480

Ser Leu Ser Lys Leu Leu Asp Ile Phe Glu Val Ala Pro Asn Glu Ala
485 490 495

Ile Gln Val Met Glu Ile Ser Gln Thr Arg Lys Ser Leu Tyr Ile Gly
500 505 510

Thr Asp His Arg Ile Lys Gln Ile Asp Leu Ala Met Cys Asn Arg Arg
515 520 525

Tyr Asp Asn Cys Phe Arg Cys Val Arg Asp Pro Tyr Cys Gly Trp Asp
530 535 540

Lys Glu Ala Asn Thr Cys Arg Pro Tyr Glu Leu Asp Leu Leu Gln Asp
545 550 555 560

Val Ala Asn Glu Thr Ser Asp Ile Cys Asp Ser Ser Val Leu Lys Lys
565 570 575

Lys Ile Val Val Thr Tyr Gly Gln Ser Val His Leu Gly Cys Phe Val
580 585 590

Lys Ile Pro Glu Val Leu Lys Asn Glu Gln Val Thr Trp Tyr His His
595 600 605

Ser Lys Asp Lys Gly Arg Tyr Glu Ile Arg Tyr Ser Pro Thr Lys Tyr
610 615 620

Ile Glu Thr Thr Glu Arg Gly Leu Val Val Val Ser Val Asn Glu Ala
625 630 635 640

Asp Gly Gly Arg Tyr Asp Cys His Leu Gly Gly Ser Leu Leu Cys Ser
645 650 655

Tyr Asn Ile Thr Val Asp Ala His Arg Cys Thr Pro Pro Asn Lys Ser
660 665 670

Asn Asp Tyr Gln Lys Ile Tyr Ser Asp Trp Cys His Glu Phe Glu Lys
675 680 685

Tyr Lys Thr Ala Met Lys Ser Trp Glu Lys Lys Gln Gly Gln Cys Ser
690 695 700

Thr Arg Gln Asn Phe Ser Cys Asn Gln His Pro Asn Glu Ile Phe Arg
705 710 715 720

Lys Pro Asn Val


(2) INFORMATION FOR SEQ ID NO: 63:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2504 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 355..2493

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 63: 

GGCCGGTCGA CCACGAGCGA AGTTTAGTAT CAAGTTGAGA GTTTGTTTGG AGCGTAGTTT 60

ACGGAGCGTA CATTTAAATT TGCGGACAAA TCGTGTTTTG GTGCTTCTCT GTGGATTGTT 120

GTGTTCTTGA AGATGCTTCC CTTGGTTTTC GGATAAGCTT TCCTGTGGAT TGTTGTGTTC 180

TTGAAGATGC TTCCCTTGGT TTTCGGATAA GCTTTCCAGC GTGGTTTCAG CCTCGGCTTG 240

TTTGGACCCC GACATAATCT TCGAACTACA ATGAAGAGGA AATTTTGAAA CGCGTTTCAG 300

ACGCGTACAA TCGACAAAAT GTTTGGTTTC CAATTGATCT TGCAATGTAG CTAC ATG 357
Met
1

GTG GTG AAG ATC TTG GTT TGG TCG ATA TGT CTG ATA GCG CTG TGT CAT 405
Val Val Lys Ile Leu Val Trp Ser Ile Cys Leu Ile Ala Leu Cys His
5 10 15

GCT TGG ATG CCG GAT AGT TCT TCC AAA TTA ATA AAC CAT TTT AAA TCA 453
Ala Trp Met Pro Asp Ser Ser Ser Lys Leu Ile Asn His Phe Lys Ser
20 25 30

GTT GAA AGT AAA AGC TTT ACC GGG AAC GCC ACG TTC CCT GAT CAC TTT 501
Val Glu Ser Lys Ser Phe Thr Gly Asn Ala Thr Phe Pro Asp His Phe
35 40 45

ATT GTC TTG AAT CAA GAC GAA ACT TCG ATA TTA GTA GGC GGT AGA AAT 549
Ile Val Leu Asn Gln Asp Glu Thr Ser Ile Leu Val Gly Gly Arg Asn
50 55 60 65

AGG GTT TAC AAT TTA AGT ATA TTC GAC CTC AGT GAG CGT AAA GGG GGG 597
Arg Val Tyr Asn Leu Ser Ile Phe Asp Leu Ser Glu Arg Lys Gly Gly
70 75 80

CGA ATC GAC TGG CCA TCG TCC GAT GCA CAT GGC CAG TTG TGT ATA TTG 645
Arg Ile Asp Trp Pro Ser Ser Asp Ala His Gly Gln Leu Cys Ile Leu
85 90 95

AAA GGG AAA ACG GAC GAC GAC TGC CAA AAT TAC ATT AGA ATA CTG TAC 693
Lys Gly Lys Thr Asp Asp Asp Cys Gln Asn Tyr Ile Arg Ile Leu Tyr
100 105 110

TCT TCA GAA CCG GGG AAA TTA GTT ATT TGC GGG ACC AAT TCG TAC AAA 741
Ser Ser Glu Pro Gly Lys Leu Val Ile Cys Gly Thr Asn Ser Tyr Lys
115 120 125

CCC CTC TGT CGG ACG TAC GCA TTT AAG GAG GGA AAG TAC CTG GTT GAG 789
Pro Leu Cys Arg Thr Tyr Ala Phe Lys Glu Gly Lys Tyr Leu Val Glu
130 135 140 145

AAA GAA [illegible] GGG ATA GGC TTG TGT CCA [illegible] CCG GAA CAC AAC 837
Lys Glu Val Glu Gly Ile Gly Leu Cys Pro Tyr Asn Pro Glu His Asn
150 155 160

AGC ACA TCT GTC TCC TAC AAT GGC CAA TTA TTT TCA GCG ACG GTC GCC 885
Ser Thr Ser Val Ser Tyr Asn Gly Gln Leu Phe Ser Ala Thr Val Ala
165 170 175

GAC TTT TCC GGG GGC GAC CCT CTC ATA TAC AGG GAG CCC CAG CGC ACC 933
Asp Phe Ser Gly Gly Asp Pro Leu Ile Tyr Arg Glu Pro Gln Arg Thr
180 185 190

GAA CTC TCA GAT CTC AAA CAA CTG AAC GCA CCG AAT TTC GTA AAC TCG 981
Glu Leu Ser Asp Leu Lys Gln Leu Asn Ala Pro Asn Phe Val Asn Ser
195 200 205

GTG GCC TAT GGC GAC TAC ATA TTC TTC TTC TAC CGT GAA ACC GCC GTC 1029
Val Ala Tyr Gly Asp Tyr Ile Phe Phe Phe Tyr Arg Glu Thr Ala Val
210 215 220 225

GAG TAC ATG AAC TGC GGA AAA GTC ATC TAC TCG CGG GTC GCC AGG GTG 1077
Glu Tyr Met Asn Cys Gly Lys Val Ile Tyr Ser Arg Val Ala Arg Val
230 235 240

TGC AAG GAC GAC AAA GGG GGC CCT CAC CAG TCA CGC GAC CGC TGG ACG 1125
Cys Lys Asp Asp Lys Gly Gly Pro His Gln Ser Arg Asp Arg Trp Thr
245 250 255

TCG TTC CTC AAA GCA CGT CTC AAT TGT TCA ATT CCC GGC GAG TAC CCC 1173
Ser Phe Leu Lys Ala Arg Leu Asn Cys Ser Ile Pro Gly Glu Tyr Pro
260 265 270

TTT TAC TTT GAT GAA ATC CAA TCA ACA AGT GAT ATA GTC GAG GGT CGG 1221
Phe Tyr Phe Asp Glu Ile Gln Ser Thr Ser Asp Ile Val Glu Gly Arg
275 280 285

TAC AAT TCC GAC GAC AGC AAA AAG ATC ATT TAT GGA ATC CTC ACA ACT 1269
Tyr Asn Ser Asp Asp Ser Lys Lys Ile Ile Tyr Gly Ile Leu Thr Thr
290 295 300 305

CCA GTT AAT GCC ATC GGC GGC TCG GCC ATT TGC GCG TAT CAA ATG GCC 1317
Pro Val Asn Ala Ile Gly Gly Ser Ala Ile Cys Ala Tyr Gln Met Ala
310 315 320

GAC ATC TTG CGC GTG TTT GAA GGG AGC TTC AAG CAC CAA GAG ACG ATC 1365
Asp Ile Leu Arg Val Phe Glu Gly Ser Phe Lys His Gln Glu Thr Ile
325 330 335

AAC TCG AAC TGG CTC CCC GTG CCC CAG AAC CTA GTC CCT GAA CCC AGG 1413
Asn Ser Asn Trp Leu Pro Val Pro Gln Asn Leu Val Pro Glu Pro Arg
340 345 350

CCC GGG CAG TGC GTA CGC GAC AGC AGG ATC CTG CCC GAC AAG AAC GTC 1461
Pro Gly Gln Cys Val Arg Asp Ser Arg Ile Leu Pro Asp Lys Asn Val
355 360 365

AAC TTT ATT AAG ACC CAC TCT TTG ATG GAG GAC GTT CCG GCT CTT TTC 1509
Asn Phe Ile Lys Thr His Ser Leu Met Glu Asp Val Pro Ala Leu Phe
370 375 380 385

GGA AAA CCA GTT CTG GTC CGA GTG AGT CTG CAG TAT CGG TTT ACA GCC 1557
Gly Lys Pro Val Leu Val Arg Val Ser Leu Gln Tyr Arg Phe Thr Ala
390 395 400

ATA ACA GTG GAT CCA CAA GTG AAA ACA ATC AAT AAT CAG TAT CTC GAT 1605
Ile Thr Val Asp Pro Gln Val Lys Thr Ile Asn Asn Gln Tyr Leu Asp
405 410 415

GTT TTG TAT [illegible] ACA GAT GAT GGG AAG GTA CTA [illegible] GCT GTT AAT 1653
Val Leu Tyr Ile Gly Thr Asp Asp Gly Lys Val Leu Lys Ala Val Asn
420 425 430

ATA CCA AAG CGA CAC GCT AAA GCG TTG TTA TAT CGA AAA TAC CGT ACA 1701
Ile Pro Lys Arg His Ala Lys Ala Leu Leu Tyr Arg Lys Tyr Arg Thr
435 440 445

TCC GTA CAT CCG CAC GGA GCT CCC GTA AAA CAG CTG AAG ATC GCT CCC 1749
Ser Val His Pro His Gly Ala Pro Val Lys Gln Leu Lys Ile Ala Pro
450 455 460 465

GGT TAT GGC AAA GTT GTG GTG GTC GGG AAA GAC GAA ATC AGA CTT GCT 1797
Gly Tyr Gly Lys Val Val Val Val Gly Lys Asp Glu Ile Arg Leu Ala
470 475 480

AAT CTC AAC CAT TGT GCA AGC AAA ACG CGG TGC AAG GAC TGT GTG GAA 1845
Asn Leu Asn His Cys Ala Ser Lys Thr Arg Cys Lys Asp Cys Val Glu
485 490 495

CTG CAA GAC CCA CAT TGC GCC TGG GAC GCC AAA CAA AAC CTG TGT GTC 1893
Leu Gln Asp Pro His Cys Ala Trp Asp Ala Lys Gln Asn Leu Cys Val
500 505 510

AGC ATT GAC ACC GTC ACT TCG TAT CGC TTC CTG ATC CAG GAC GTA GTT 1941
Ser Ile Asp Thr Val Thr Ser Tyr Arg Phe Leu Ile Gln Asp Val Val
515 520 525

CGC GGC GAC GAC AAC AAA TGT TGG TCG CCG CAA ACA GAC AAA AAG ACT 1989
Arg Gly Asp Asp Asn Lys Cys Trp Ser Pro Gln Thr Asp Lys Lys Thr
530 535 540 545

GTG ATT AAG AAT AAG CCC AGC GAG GTT GAG AAC GAG ATT ACG AAC TCC 2037
Val Ile Lys Asn Lys Pro Ser Glu Val Glu Asn Glu Ile Thr Asn Ser
550 555 560

ATT GAC GAA AAG GAT CTC GAT TCA AGC GAT CCG CTC ATC AAA ACT GGT 2085
Ile Asp Glu Lys Asp Leu Asp Ser Ser Asp Pro Leu Ile Lys Thr Gly
565 570 575

CTC GAT GAC GAT TCC GAT TGT GAT CCA GTC AGC GAG AAC AGC ATA GGC 2133
Leu Asp Asp Asp Ser Asp Cys Asp Pro Val Ser Glu Asn Ser Ile Gly
580 585 590

GGA TGC GCC GTC CGC CAG CAA CTT GTT ATA TAC ACA GCT GGG ACT CTA 2181
Gly Cys Ala Val Arg Gln Gln Leu Val Ile Tyr Thr Ala Gly Thr Leu
595 600 605

CAC ATT GTC GTG GTC GTC GTC AGC ATC GTG GGT TTA TTT TCT TGG CTT 2229
His Ile Val Val Val Val Val Ser Ile Val Gly Leu Phe Ser Trp Leu
610 615 620 625

TAT AGC GGG TTA TCT GTT TTC GCA AAA TTT CAC TCG GAT TCG CAA TAT 2277
Tyr Ser Gly Leu Ser Val Phe Ala Lys Phe His Ser Asp Ser Gln Tyr
630 635 640

CCT GAG GCG CCG TTT ATA GAG CAG CAC AAT CAT TTG GAA AGA TTA AGC 2325
Pro Glu Ala Pro Phe Ile Glu Gln His Asn His Leu Glu Arg Leu Ser
645 650 655

GCC AAC CAG ACG GGG TAT TTG ACT CCG AGG GCC AAT AAA GCG GTC AAT 2373
Ala Asn Gln Thr Gly Tyr Leu Thr Pro Arg Ala Asn Lys Ala Val Asn
660 665 670

TTG GTG GTG AAG GTG TCT AGT AGC ACG CCG CGG CCG AAA AAG GAC AAT 2421
Leu Val Val Lys Val Ser Ser Ser Thr Pro Arg Pro Lys Lys Asp Asn
675 680 685

CTC GAT GTC [illegible] AAA GAC TTG AAC ATT GCG AGT [illegible] GGG ACT TTG CAA 2469
Leu Asp Val Ser Lys Asp Leu Asn Ile Ala Ser Asp Gly Thr Leu Gln
690 695 700 705

AAA ATC AAG AAG ACT TAC ATT TAGTGCGACT TTTT 2504
Lys Ile Lys Lys Thr Tyr Ile
710

(2) INFORMATION FOR SEQ ID NO: 64:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 712 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 64:

Met Val Val Lys Ile Leu Val Trp Ser Ile Cys Leu Ile Ala Leu Cys
1 5 10 15

His Ala Trp Met Pro Asp Ser Ser Ser Lys Leu Ile Asn His Phe Lys
20 25 30

Ser Val Glu Ser Lys Ser Phe Thr Gly Asn Ala Thr Phe Pro Asp His
35 40 45

Phe Ile Val Leu Asn Gln Asp Glu Thr Ser Ile Leu Val Gly Gly Arg
50 55 60

Asn Arg Val Tyr Asn Leu Ser Ile Phe Asp Leu Ser Glu Arg Lys Gly
65 70 75 80

Gly Arg Ile Asp Trp Pro Ser Ser Asp Ala His Gly Gln Leu Cys Ile
85 90 95

Leu Lys Gly Lys Thr Asp Asp Asp Cys Gln Asn Tyr Ile Arg Ile Leu
100 105 110

Tyr Ser Ser Glu Pro Gly Lys Leu Val Ile Cys Gly Thr Asn Ser Tyr
115 120 125

Lys Pro Leu Cys Arg Thr Tyr Ala Phe Lys Glu Gly Lys Tyr Leu Val
130 135 140

Glu Lys Glu Val Glu Gly Ile Gly Leu Cys Pro Tyr Asn Pro Glu His
145 150 155 160

Asn Ser Thr Ser Val Ser Tyr Asn Gly Gln Leu Phe Ser Ala Thr Val
165 170 175

Ala Asp Phe Ser Gly Gly Asp Pro Leu Ile Tyr Arg Glu Pro Gln Arg
180 185 190

Thr Glu Leu Ser Asp Leu Lys Gln Leu Asn Ala Pro Asn Phe Val Asn
195 200 205

Ser Val Ala Tyr Gly Asp Tyr Ile Phe Phe Phe Tyr Arg Glu Thr Ala
210 215 220

Val Glu Tyr Met Asn Cys Gly Lys Val Ile Tyr Ser Arg Val Ala Arg
225 230 235 240

Val Cys Lys Asp Asp Lys Gly Gly Pro His Gln Ser Arg Asp Arg Trp
245 250 255

Thr Ser Phe Leu Lys Ala Arg Leu Asn Cys Ser Ile Pro Gly Glu Tyr
260 265 270

Pro Phe Tyr Phe Asp Glu Ile Gln Ser Thr Ser Asp Ile Val Glu Gly
275 280 285

Arg Tyr Asn Ser Asp Asp Ser Lys Lys Ile Ile Tyr Gly Ile Leu Thr
290 295 300

Thr Pro Val Asn Ala Ile Gly Gly Ser Ala Ile Cys Ala Tyr Gln Met
305 310 315 320

Ala Asp Ile Leu Arg Val Phe Glu Gly Ser Phe Lys His Gln Glu Thr
325 330 335

Ile Asn Ser Asn Trp Leu Pro Val Pro Gln Asn Leu Val Pro Glu Pro
340 345 350

Arg Pro Gly Gln Cys Val Arg Asp Ser Arg Ile Leu Pro Asp Lys Asn
355 360 365

Val Asn Phe Ile Lys Thr His Ser Leu Met Glu Asp Val Pro Ala Leu
370 375 380

Phe Gly Lys Pro Val Leu Val Arg Val Ser Leu Gln Tyr Arg Phe Thr
385 390 395 400

Ala Ile Thr Val Asp Pro Gln Val Lys Thr Ile Asn Asn Gln Tyr Leu
405 410 415

Asp Val Leu Tyr Ile Gly Thr Asp Asp Gly Lys Val Leu Lys Ala Val
420 425 430

Asn Ile Pro Lys Arg His Ala Lys Ala Leu Leu Tyr Arg Lys Tyr Arg
435 440 445

Thr Ser Val His Pro His Gly Ala Pro Val Lys Gln Leu Lys Ile Ala
450 455 460

Pro Gly Tyr Gly Lys Val Val Val Val Gly Lys Asp Glu Ile Arg Leu
465 470 475 480

Ala Asn Leu Asn His Cys Ala Ser Lys Thr Arg Cys Lys Asp Cys Val
485 490 495

Glu Leu Gln Asp Pro His Cys Ala Trp Asp Ala Lys Gln Asn Leu Cys
500 505 510

Val Ser Ile Asp Thr Val Thr Ser Tyr Arg Phe Leu Ile Gln Asp Val
515 520 525

Val Arg Gly Asp Asp Asn Lys Cys Trp Ser Pro Gln Thr Asp Lys Lys
530 535 540

Thr Val Ile Lys Asn Lys Pro Ser Glu Val Glu Asn Glu Ile Thr Asn
545 550 555 560

Ser Ile Asp Glu Lys Asp Leu Asp Ser Ser Asp Pro Leu Ile Lys Thr
565 570 575

Gly Leu Asp Asp Asp Ser Asp Cys Asp Pro Val Ser Glu Asn Ser Ile
580 585 590

Gly Gly Cys Ala Val Arg Gln Gln Leu Val Ile Tyr Thr Ala Gly Thr
595 600 605

Leu His Ile Val Val Val Val Val Ser Ile Val Gly Leu Phe Ser Trp
610 615 620

Leu Tyr Ser Gly Leu Ser Val Phe Ala Lys Phe His Ser Asp Ser Gln
625 630 635 640

Tyr Pro Glu Ala Pro Phe Ile Glu Gln His Asn His Leu Glu Arg Leu
645 650 655

Ser Ala Asn Gln Thr Gly Tyr Leu Thr Pro Arg Ala Asn Lys Ala Val
660 665 670

Asn Leu Val Val Lys Val Ser Ser Ser Thr Pro Arg Pro Lys Lys Asp
675 680 685

Asn Leu Asp Val Ser Lys Asp Leu Asn Ile Ala Ser Asp Gly Thr Leu
690 695 700

Gln Lys Ile Lys Lys Thr Tyr Ile
705 710



(2) INFORMATION FOR SEQ ID NO: 65: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 369 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 



(ii) MOLECULE TYPE: cDNA

(ix) FEATURE:

(A) NAME/KEY: CDS

(B) LOCATION: 1..369



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 65: 

ATG ATT TAT TTA TAC ACG GCG GAT AAC GTA ATT CCA AAA GAT GGT TTA 48
Met Ile Tyr Leu Tyr Thr Ala Asp Asn Val Ile Pro Lys Asp Gly Leu
1 5 10 15

CAA GGA GCA TTT GTC GAT AAA GAC GGT ACT TAT GAC AAA GTT TAC ATT 96
Gln Gly Ala Phe Val Asp Lys Asp Gly Thr Tyr Asp Lys Val Tyr Ile
20 25 30

CTT TTC ACT GTT ACT ATC GGC TCA AAG AGA ATT GTT AAA ATT CCG TAT 144
Leu Phe Thr Val Thr Ile Gly Ser Lys Arg Ile Val Lys Ile Pro Tyr
35 40 45

ATA GCA CAA ATG TGC TTA AAC GAC GAA TGT GGT CCA TCA TCA TTG TCT 192
Ile Ala Gln Met Cys Leu Asn Asp Glu Cys Gly Pro Ser Ser Leu Ser
50 55 60

AGT CAT AGA TGG TCG ACG TTG CTC AAA GTC GAA TTA GAA TGT GAC ATC 240
Ser His Arg Trp Ser Thr Leu Leu Lys Val Glu Leu Glu Cys Asp Ile
65 70 75 80

GAC GGA AGA AGT TAT AGT CAA ATT AAT CAT TCT AAA ACT ATA AAA CAG 288
Asp Gly Arg Ser Tyr Ser Gln Ile Asn His Ser Lys Thr Ile Lys Gln
85 90 95

ATA ATG ATA CGA TAC TAT ATG TAT TCT TTG ATA GTC CTT TTC CAA GTC 336
Ile Met Ile Arg Tyr Tyr Met Tyr Ser Leu Ile Val Leu Phe Gln Val
100 105 110

CGC ATT ATG TAC CTA TTC TAT GAA TAC CAT TA 369
Arg Ile Met Tyr Leu Phe Tyr Glu Tyr His
115 120
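The pairing between each nucleotide row and the amino acid row beneath it can be spot-checked by translating the codons with the standard genetic code. The sketch below is a minimal illustration, not part of the filing: the helper name `translate_row` and the table construction are assumptions made for the example, and the two input rows are the first rows of SEQ ID NO: 65 as printed above.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, with codons enumerated in TCAG order (TTT, TTC, TTA, ...).
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# One-letter to three-letter codes; "Ter" marks a stop codon.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "Ter",
}


def translate_row(row):
    """Translate a space-separated row of codons into three-letter residues."""
    return " ".join(THREE_LETTER[CODON_TABLE[codon]] for codon in row.split())


# First two codon rows of SEQ ID NO: 65, copied from the listing.
ROW_1 = "ATG ATT TAT TTA TAC ACG GCG GAT AAC GTA ATT CCA AAA GAT GGT TTA"
ROW_2 = "CAA GGA GCA TTT GTC GAT AAA GAC GGT ACT TAT GAC AAA GTT TAC ATT"

print(translate_row(ROW_1))
print(translate_row(ROW_2))
```

A row that does not translate to the residues printed beneath it points to a transcription error in one of the two lines, which is how the paired rows in this listing can be cross-checked against each other.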

(2) INFORMATION FOR SEQ ID NO: 66:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 122 amino acids

(B) TYPE: amino acid

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 66:

Met Ile Tyr Leu Tyr Thr Ala Asp Asn Val Ile Pro Lys Asp Gly Leu
1 5 10 15

Gln Gly Ala Phe Val Asp Lys Asp Gly Thr Tyr Asp Lys Val Tyr Ile
20 25 30

Leu Phe Thr Val Thr Ile Gly Ser Lys Arg Ile Val Lys Ile Pro Tyr
35 40 45

Ile Ala Gln Met Cys Leu Asn Asp Glu Cys Gly Pro Ser Ser Leu Ser
50 55 60

Ser His Arg Trp Ser Thr Leu Leu Lys Val Glu Leu Glu Cys Asp Ile
65 70 75 80

Asp Gly Arg Ser Tyr Ser Gln Ile Asn His Ser Lys Thr Ile Lys Gln
85 90 95

Ile Met Ile Arg Tyr Tyr Met Tyr Ser Leu Ile Val Leu Phe Gln Val
100 105 110

Arg Ile Met Tyr Leu Phe Tyr Glu Tyr His
115 120


WHAT IS CLAIMED IS:



1. An isolated peptide of at least 5 amino acids comprising a unique portion of
a semaphorin, and said peptide has a semaphorin binding specificity.

2. An isolated peptide according to claim 1 wherein said semaphorin comprises
a human semaphorin.

3. An isolated antibody that specifically binds a peptide according to claim 1.

4. An isolated nucleic acid comprising a nucleotide sequence encoding a
peptide according to claim 1 wherein said sequence is joined to a nucleotide not
naturally joined to said sequence and said sequence is other than that of the A39
ORF of vaccinia virus.

5. A cell comprising a nucleic acid according to claim 3.

6. A transgenic rodent comprising a nucleic acid according to claim 7 wherein
said nucleic acid is xenogeneic to said rodent.

7. A process for the production of a recombinant unique portion of a
semaphorin comprising culturing the cell of claim 4 under conditions suitable for
the expression of said peptide, and recovering said peptide.

8. A method of identifying a pharmacological agent useful in the diagnosis or
treatment of disease associated with the binding of a semaphorin to a semaphorin
receptor, said method comprising the steps of:

contacting a panel of prospective agents with a peptide according to claim 1;

measuring the binding of a plurality of said prospective agents to said
peptide;

identifying from said plurality a pharmacological agent which specifically
binds said peptide;

wherein said pharmacological agent is useful in the diagnosis or treatment
of disease associated with the binding of a semaphorin to a cellular receptor.

9. A method of diagnosing a patient for a predisposition to neurological disease
associated with a genetic locus, said method comprising the steps of:

isolating somatic cells from a patient;

isolating genomic DNA from said somatic cells;

contacting said genomic DNA with a probe comprising a DNA
sequence encoding a peptide according to claim 1 under conditions wherein said
probe hybridizes to homologous DNA;

identifying a region of said genomic DNA which hybridizes with said probe;

wherein the presence, absence or sequence of said region correlates with a
predisposition to a neurological disease.

10. A method of treating a patient with neurological injury or disease or a
pathological viral infection, said method comprising the steps of:

administering to a patient a therapeutically effective dosage of a
pharmaceutical composition comprising a pharmaceutically acceptable carrier and a
peptide according to claim 1;

wherein said peptide modulates neural cell growth cone function or viral
pathogenicity in said patient.

11. An isolated polypeptide comprising an amino acid sequence substantially
similar to that of a semaphorin, and said polypeptide has a semaphorin binding
specificity.

12. An isolated peptide of at least about 5 amino acids comprising a unique
portion of a semaphorin receptor, and said peptide has a semaphorin receptor
binding specificity.

13. An isolated antibody that specifically binds a peptide according to claim
12.

14. An isolated nucleic acid comprising a nucleotide sequence encoding a
peptide according to claim 12 wherein said sequence is joined to a nucleotide not
naturally joined to said sequence.

15. A cell comprising a nucleic acid according to claim 14.

16. A process for the production of a recombinant unique portion of a
semaphorin receptor peptide according to claim 12 comprising culturing the cell of
claim 14 under conditions suitable for the expression of said peptide, and
recovering said peptide.

17. A method of identifying a pharmacological agent useful in the diagnosis or
treatment of disease associated with the binding of a semaphorin to a cellular
receptor, said method comprising the steps of:

contacting a panel of prospective agents with a peptide according to claim 12;

measuring the binding of a plurality of said prospective agents to said
peptide;

identifying from said plurality a pharmacological agent which specifically
binds said peptide;

wherein said pharmacological agent is useful in the diagnosis or treatment
of disease associated with the binding of a semaphorin to a cellular receptor.

18. A method of diagnosing a patient for a predisposition to neurological disease
associated with a genetic locus, said method comprising the steps of:

isolating somatic cells from a patient;

isolating genomic DNA from said somatic cells;

contacting said genomic DNA with a probe comprising a DNA
sequence encoding a peptide according to claim 12 under conditions wherein said
probe hybridizes to homologous DNA;

identifying a region of said genomic DNA which hybridizes with said probe;

wherein the presence, absence or sequence of said region correlates with a
predisposition to a neurological disease.



19. A method of treating a patient with neurological injury or disease or a 
5 pathological viral infection, said method comprising the steps of: 

administering to a patient a therapeutically effective dosage of a
pharmaceutical composition comprising a pharmaceutically acceptable carrier and a
peptide according to claim 12;

wherein said peptide modulates neural cell growth cone function or viral
pathogenicity in said patient.

20. An isolated polypeptide comprising an amino acid sequence substantially 
similar to that of a semaphorin receptor, and said polypeptide has a semaphorin 
receptor binding specificity. 
BOX II. OBSERVATIONS WHERE UNITY OF INVENTION WAS LACKING
This ISA found multiple inventions as follows: 

This application contains the following inventions or groups of inventions which are not so linked as to form a
single inventive concept under PCT Rule 13.1. In order for all inventions to be examined, the appropriate additional 
examination fees must be paid. 

Group I, claims 1, 2, 7, 8 and 11, drawn to semaphorin peptides with semaphorin binding specificity, a method for 
producing said peptides, and a method for screening potential pharmaceuticals using said peptides. 

Group II, claim 3, drawn to an antibody against the peptide of 1. 

Group III, claim 4, drawn to a nucleic acid encoding a peptide of I. 

Group IV, claims 5 and 6, drawn to a cell and a rodent containing the nucleic acid of III.

Group V, claim 9, drawn to a diagnostic method using the nucleic acid of III. 

Group VI, claim 10, drawn to a treatment method using the peptide of I. 

Group VII, claims 12, 17 and 20, drawn to semaphorin peptides having semaphorin receptor binding specificity, and a
method for screening potential pharmaceuticals using said peptides. 
Group VIII, claim 13, drawn to an antibody against the peptide of VII. 

Group IX, claim 14, drawn to a nucleic acid encoding the peptide of VII. 

Group X, claims 15 and 16, drawn to a cell containing the nucleic acid of IX and a method of producing the peptide of
VII. 

Group XI, claim 18, drawn to a diagnostic method using the nucleic acid of IX. 
Group XII, claim 19, drawn to a treatment method using the peptide of VII.

The inventions listed as Groups I-XII do not relate to a single inventive concept under PCT Rule 13.1 
because, under PCT Rule 13.2, they lack the same or corresponding special technical features for the following reasons: 

Groups I-VI are distinct from each of groups VII-XII because I-VI and VII-XII are drawn to compositions and methods
containing and utilizing two different classes of peptides, those which bind semaphorin and those which bind
semaphorin receptor. The compositions and methods of I-VI do not require the compositions and methods of VII-XII,
and the compositions and methods of VII-XII do not require the compositions and methods of I-VI.

Group II is distinct from each of I and III-VI because the antibody of II is not required for the methods and
compositions of I and III-VI, and the methods and compositions of III-VI are not required to produce the antibody of II.
While the peptide of I can be used to elicit production of the antibody of II, the peptide can be used for other purposes 
as well, such as the screening and treatment methods of I and VI. 

Group III is distinct from each of Groups I and V, because they are related as product and process of use. The product
of III can be used for several different processes, for example the divergent processes of I and V.

Group I is distinct from each of groups IV and V because the compositions and methods of I are not required for the
compositions and methods of IV and V, and the compositions and methods of IV and V are not required for I. The 
peptides of I can be obtained without the cells of IV, for example by chemical synthesis. 

Groups I and VI are distinct because the method of VI is not required for the compositions and methods of I, and the 
peptide of I can be used for other methods, such as the screening method of claim 8. 

Groups III and IV are distinct because they are related as intermediate and final product. The intermediate (III) can be
used for other purposes, such as the methods of I and V.

Groups III and VI are distinct because the composition of III is not required for the method of VI and the method of VI
is not required for the composition of III.

Group IV is distinct from each of groups V and VI because the compositions of IV are not required for the methods of
V and VI, and the methods of V and VI are not required to produce the compositions of IV. 

Groups V and VI are distinct because the two methods require different procedures and starting materials to achieve 
divergent ends. 

Group VIII is distinct from each of VII and IX-XII because the antibody of VIII is not required for the methods and 
compositions of VII and IX-XII, and the methods and compositions of IX-XII are not required to produce the antibody 
of VIII. While the peptide of VII can be used to elicit production of the antibody of VIII, the peptide can be used for 
other purposes as well, such as the screening and treatment methods of VII and XII. 

Group IX is distinct from each of Groups X and XI, because they are related as product and process of use. The
product of IX can be used for several different processes, for example the divergent processes of X and XI. 

Group VII is distinct from each of groups IX and XI because the compositions and methods of VII are not required for
the compositions and methods of IX and XI, and the compositions and methods of IX and XI are not required for VII.

Groups VII and X are related as product and process of making. The peptide of VII can be produced without the 
method of X, for example by chemical synthesis. 

Groups VII and XII are distinct because the method of XII is not required for the compositions and methods of VII, and 
the peptide of VII can be used for other methods, such as the screening method of claim 17. 

Groups IX and XII are distinct because the composition of IX is not required for the method of XII and the method of 
XII is not required for the composition of IX. 

Group X is distinct from each of groups XI and XII because the compositions of X are not required for the methods of 
XI and XII, and the methods of XI and XII are not required to produce the compositions of X. 

Groups XI and XII are distinct because the two methods require different procedures and starting materials to achieve 
divergent ends. 

Accordingly the claims are not so linked by a special technical feature within the meaning of PCT Rule 13.2 so as to 
form a single inventive concept. 



Form PCT/ISA/210 (extra sheet)(July 1992)* 






